Methods and compositions for generating and amplifying DNA libraries for sensitive detection and analysis of DNA methylation

ABSTRACT

The present invention concerns a variety of methods and compositions for obtaining epigenetic information, such as DNA methylation patterns, through the preparation, amplification and analysis of Methylome libraries. In particular, the method employs preparation of a DNA molecule by digesting the DNA molecule with at least one methylation-sensitive restriction enzyme; incorporating a nucleic acid molecule into at least some of the digested DNA molecules by either (1) incorporating at least one primer from a plurality of primers that have a 5′ constant sequence and a 3′ variable sequence, wherein the primers are substantially non-self-complementary and substantially non-complementary to other primers in the plurality; or (2) incorporating an oligonucleotide having an inverted repeat and a loop under conditions wherein the oligonucleotide becomes blunt-end ligated to one strand of the digested DNA molecule, followed by polymerization from a 3′ hydroxyl group present in a nick in the oligonucleotide-linked molecule; and amplifying one or more of the DNA molecules.

The present application is a continuation of U.S. patent application Ser. No. 13/859,034, filed Apr. 9, 2013, which is a continuation of U.S. patent application Ser. No. 11/071,864, filed Mar. 3, 2005, now U.S. Pat. No. 8,440,404, issued on May 14, 2013, which claims priority to U.S. Provisional Patent Application 60/551,941, filed Mar. 8, 2004, the entire contents of each of which are herein incorporated by reference.

Pursuant to 37 C.F.R. 1.821(c), a sequence listing is submitted herewith as an ASCII compliant text file named “RUBCP0023USC2_ST25.txt” created on May 11, 2017 and having a size of ~52 kilobytes. The content of the aforementioned file is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention is directed to the fields of genomics, molecular biology, the epigenetic control of gene expression, and molecular diagnostics. In some embodiments, the present invention relates to methods for amplification and identification of DNA fragments surrounding methylation sites. In other embodiments, the present invention relates to methods for amplifying and identifying sites that are hypomethylated. In some embodiments, the present invention relates to methods for the analysis of methylation of cytosine within the CpG dinucleotide in eukaryotic genomes and its implication in developmental biology, gene imprinting, and cancer diagnostics.

BACKGROUND OF THE INVENTION

Cytosine methylation occurs after DNA synthesis by enzymatic transfer of a methyl group from an S-adenosylmethionine donor to the carbon-5 position of cytosine. The enzymatic reaction is performed by one of a family of enzymes known as DNA methyltransferases. The predominant sequence recognition motif for mammalian DNA methyltransferases is 5′-CpG-3′, although non-CpG methylation has also been reported. Due to the high rate of methylcytosine to thymine transition mutations, the CpG dinucleotide is severely under-represented and unequally distributed across the human genome. Vast stretches of DNA are depleted of CpGs, and these are interspersed by CpG clusters known as CpG islands. About 50-60% of known genes contain CpG islands in their promoter regions, and they are maintained in a largely unmethylated state except in the cases of normal developmental gene expression control, gene imprinting, X chromosome silencing, ageing, or aberrant methylation in cancer and some other pathological conditions. The patterns of DNA methylation are a critical point of interest for genomic studies of cancer, disease, and ageing. Methylation of DNA has been investigated in terms of cellular methylation patterns, global methylation patterns, and site-specific methylation patterns. The goal of methylation analysis is to develop discovery tools that increase our understanding of the mechanisms of cancer progression, and diagnostic tools that allow the early detection, diagnosis, and treatment of cancers and other diseases. In recent years it has become apparent that the transcriptional silencing associated with 5-methylcytosine is important in mammalian development, genome imprinting, X chromosome inactivation, mental health, and cancer, as well as for protection against intragenomic parasites.
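The under-representation of CpG described above is commonly expressed as an observed/expected CpG ratio. The following is a minimal sketch, not part of the disclosed methods: the function names and the example sequences are hypothetical, and the formula (number of CpG × length) / (number of C × number of G) is a widely used convention rather than one stated in this document.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G).

    Values well below 1 indicate CpG depletion; CpG islands
    typically show values much closer to 1.
    """
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Hypothetical toy sequences, chosen only for illustration.
island_like = "CGCGGCGCTACGCGCCGACG"
depleted = "CAGTGCTGACCTGTGAGCCA"

for name, s in (("island-like", island_like), ("depleted", depleted)):
    print(name, round(gc_fraction(s), 2), round(cpg_obs_exp(s), 2))
```

A windowed version of the same calculation, applied along a chromosome, is one simple way to flag candidate CpG islands for further analysis.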

Methylation in Cancer

Epigenetics is the study of inherited changes in DNA structure that affect the expression of genes but are not due to a change in the DNA sequence. One major focus of epigenetic studies is the role of methylation in silencing gene expression. Both increased methylation (hypermethylation) and loss of methylation (hypomethylation) have been implicated in the development and progression of cancer and other diseases. Hypermethylation of gene promoter and upstream coding regions results in decreased expression of the corresponding genes. It has been proposed that hypermethylation is used as a cellular mechanism to not only decrease expression of genes not being utilized by the cell, but also to silence transposons and other viral and bacterial genes that have been incorporated into the genome. Genomic regions that are actively expressed within cells are often found to be hypomethylated in the promoter and upstream coding regions. In contrast, downstream regions are typically kept hypermethylated in actively transcribed genes, but become hypomethylated in cancer (Jones and Baylin, 2002; Baylin and Herman, 2000). Thus, there appears to be a cellular balance between silencing of genes by hypermethylation and hypomethylation of promoter and upstream coding regions of genes that are actively being expressed.

Hypermethylation of tumor suppressor genes has been correlated with the development of many forms of cancer (Jain, 2003). The genes most commonly hypermethylated in various cancers include: 14-3-3 sigma, ABL1 (P1), ABO, APC, AR (Androgen Receptor), BLT1 (Leukotriene B4 Receptor), BRCA1, CALCA (Calcitonin), CASP8 (CASPASE 8), Caveolin 1, CD44, CDH1 (E-Cadherin), CFTR, GNAL, COX2, CSPG2 (Versican), CX26 (Connexin 26), Cyclin A1, DAPK1, DBCCR1, DCIS-1, Endothelin Receptor B, EPHA3, EPO (Erythropoietin), ER (Estrogen Receptor), FHIT, GALNR2, GATA-3, COL9A1, GPC3 (Glypican 3), GST-pi, GTP-binding protein (olfactory subunit), H19, H-Cadherin (CDH13), HIC1, hMLH1, HOXA5, IGF2 (Insulin-Like Growth Factor II), IGFBP7, IRF7, KAI1, LKB1, LRP-2 (Megalin), MDGI (Mammary-derived growth inhibitor), MDR1, MDR3 (PGY3), MGMT (O6 methyl guanine methyl transferase), MINT, MT1a (metallothionein 1), MUC2, MYOD1, N33, NEP (Neutral Endopeptidase 24.1)/CALLA, NF-L (light-neurofilament-encoding gene), NIS (Sodium-Iodide Symporter gene), OCT-6, P14/ARF, P15 (CDKN2B), P16 (CDKN2A), P27KIP1, p57 KIP2, p73, PAX6, PgR (Progesterone Receptor), RAR-Beta2, RASSF1, RB1 (Retinoblastoma), RPA2 (replication protein A2), SIM2, TERT, TESTIN, TGFBR1, THBS1 (Thrombospondin-1), TIMP3, TLS3 (T-Plastin), TMEFF2, Urokinase (uPA), VHL (von Hippel-Lindau), WT1, and ZO2 (Zona Occludens 2).

While a small list of commonly hypermethylated sites is being routinely screened as potential sites of interest in many cancers, there is a current lack of methodologies for discovering new sites of interest that may play critical roles in the development and/or progression of cancer. There is also a lack of rapid and accurate methodologies for determining the methylation status of specific genes for use as diagnostic, treatment, and prognostic tools for cancer patients.

Hypomethylation has also been implicated as a mechanism responsible for tumor progression (Dunn, 2003). Several genes have been characterized as being hypomethylated in colon carcinoma and/or leukemia, including growth hormone, c-myc, gamma globulin, gamma crystallin, alpha and beta chorionic gonadotropin, insulin, proopiomelanocortin, platelet derived growth factor, c-ha-ras, c-fos, bcl-2, erb-A1, and ornithine decarboxylase. The majority of these genes are involved in growth and cell cycle regulation and it has been proposed that the loss of methylation in these genes contributes to unchecked cell proliferation in these and other cancer types.

While both hypermethylation and hypomethylation have been implicated in the development and progression of several cancers, their specific roles have not been fully elucidated. For instance, does hypermethylation of tumor suppressor genes lead to hypomethylation of cell cycle regulatory genes leading to unchecked cellular proliferation? In order to answer these and other important questions, rapid, accurate, and sensitive technologies for the analysis of DNA methylation patterns within normal and cancer cells are required.

Genome-Wide DNA Methylation Patterns

The analysis of global levels of DNA methylation has proven useful in the study of cancer, disease, and ageing. Changes in global methylation levels have been directly correlated with the development of several types of cancer, including lung, colon, hepatic, and breast cancer, and leukemia (Fruhwald and Plass, 2002). The measurement of global methylation levels has been accomplished by several distinct technologies: Southern blotting, High Pressure Liquid Chromatography (HPLC), High Performance Capillary Electrophoresis (HPCE), MALDI mass spectrometry, and chemical or enzymatic incorporation of radio-labeled methyl groups (Fraga and Esteller, 2002).

Southern blotting techniques involve traditional, two-dimensional gel electrophoresis of DNA digested with a non-methylation sensitive restriction endonuclease (first dimension), followed by a methylation sensitive restriction endonuclease (Fanning et al., 1985). This procedure allows the differential resolution of banding patterns between two samples to compare relative methylation patterns. HPLC and HPCE methods both require the breakdown of DNA into the individual nucleotides, which are then separated using either chromatography (HPLC) or electrophoresis (HPCE). For HPLC, the resulting methylcytosine and cytosine peaks can be resolved and quantified by comparison to known standards (Tawa et al., 1994; Ramsahoye, 2002). Although peaks can be identified for HPCE, there are currently no protocols for quantifying methylcytosine by this method (Fraga et al., 2000). Both of these methods are hampered by the requirement for a large amount of starting material, 2.5 μg for HPLC and 1 μg for HPCE. Furthermore, these methods also require specialized, expensive equipment.

Recently, additional variations on the basic HPLC analysis method have been developed. These methods have combined HPLC techniques with primer extension and ion pair reverse phase (IP RP) HPLC (Matin et al., 2002), or electrospray ionization mass spectrometry (Friso et al., 2002). Both of these methods have sought to improve on the accuracy and sensitivity of the previous HPLC technique. The IP RP HPLC method combines bisulfite conversion of DNA with a primer extension reaction, followed by analysis of resulting products by HPLC.

The technique of matrix-assisted laser desorption/ionization (MALDI) mass spectrometry has also been utilized for the accurate quantification of methylation in cancer samples (Tost et al., 2003).

Enzymatic and chemical labeling of methylcytosine residues have also been used in order to quantify global methylation levels. The enzymatic methods involve the addition of a radio-labeled methyl group to cytosine, resulting in an inverse correlation between incorporated label and the amount of methylation in the sample (Duthie et al., 2000). A chemical method for labeling has also been developed based on fluorescent labeling of adenine and cytosine residues by chloroacetaldehyde (Oakeley et al., 1999). This method relies on bisulfite conversion of non-methylated cytosines to uracil in order to allow the fluorescent labeling of only methylcytosine.

To study global methylation, Pogribny et al. (1999) have developed an assay based on the use of methylation-sensitive restriction endonucleases HpaII, AciI, and BssHII that leave 5′ guanine overhangs after DNA cleavage, with subsequent single radiolabeled nucleotide extension. The selective use of these enzymes was applied to screen for alterations of genome-wide methylation and CpG island methylation, respectively. The extent of radioactive label incorporation was found to be proportional to the number of unmethylated (cleaved) CpG sites.

In Situ Analysis of DNA Methylation

Another method for investigating genome wide levels of methylation involves methylcytosine specific antibodies (Miller et al., 1974). This method also allows further investigations into levels of methylation on different chromosomes and even different parts of a single chromosome (Barbin et al., 1994). Furthermore, in situ hybridization can be utilized to analyze the differential methylation patterns of adjacent cells in tissue sections.

Site-Specific DNA Methylation Analysis

Analysis of site-specific methylation patterns can be divided into two distinct groups, bisulfite conversion methods and non-bisulfite based methods. The bisulfite conversion method relies on treatment of DNA samples with sodium bisulfite, which converts unmethylated cytosine to uracil, while methylated cytosines are maintained (Furuichi et al., 1970). This conversion results in a change in the sequence of the original DNA. Analysis of the sequence of the resulting DNA allows the determination of which cytosines in the DNA were methylated. There are several methodologies utilized for the analysis of bisulfite converted DNA including sequencing, methylation-specific PCR, COBRA (COmbined Bisulfite Restriction Analysis), methylation-sensitive single nucleotide primer extension, and methylation-sensitive single-strand conformation analysis.
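The conversion logic described above can be sketched in a few lines. This is an illustrative in silico model only, under the assumptions that conversion is complete, that a single strand is considered, and that uracil is read as thymine after amplification; the function names and the example sequence are hypothetical.

```python
from typing import Iterable, List


def bisulfite_convert(seq: str, methylated_positions: Iterable[int]) -> str:
    """Model complete bisulfite conversion of one strand.

    Unmethylated C is converted (C -> U, read as T after PCR);
    methylated C, given by 0-based index, is left unchanged.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference: str, converted: str) -> List[int]:
    """Return 0-based positions where a reference C is still read as C."""
    if len(reference) != len(converted):
        raise ValueError("reference and converted read must be the same length")
    return [
        i
        for i, (r, c) in enumerate(zip(reference.upper(), converted.upper()))
        if r == "C" and c == "C"
    ]


reference = "ACGTCGACCG"               # hypothetical sequence
converted = bisulfite_convert(reference, methylated_positions=[1, 8])
print(converted)                        # ACGTTGATCG
print(call_methylation(reference, converted))  # [1, 8]
```

Comparing the converted read with the reference recovers exactly the positions that were supplied as methylated, which is the principle exploited by the sequencing-based methods discussed in this section.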

The major drawback to bisulfite conversion of DNA is that it results in up to 96% degradation of the DNA sample (Grunau et al., 2001). The harsh effect of bisulfite treatment, in combination with the need to convert all unmethylated cytosines, requires a substantial amount of input DNA in order to obtain enough usable DNA following conversion. Furthermore, the high levels of degradation complicate the detection of differences in methylation patterns in DNA samples from mixed cell populations, for example cancer cells in a background of normal cells. Changing the incubation conditions in order to minimize DNA degradation can result in incomplete conversion and the identification of false positives.
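The effect of degradation on the amount of input DNA can be illustrated with simple arithmetic. In the sketch below, the 96% figure is the upper bound cited above, while the 10 ng target and the function name are hypothetical values chosen only for illustration.

```python
def required_input_ng(usable_ng: float, degraded_fraction: float) -> float:
    """Input DNA needed so that `usable_ng` survives bisulfite treatment.

    If a fraction d of the sample is degraded, only (1 - d) survives,
    so input = usable / (1 - d).
    """
    if not 0.0 <= degraded_fraction < 1.0:
        raise ValueError("degraded_fraction must be in [0, 1)")
    return usable_ng / (1.0 - degraded_fraction)


# Hypothetical target of 10 ng usable DNA at 96% degradation.
print(round(required_input_ng(10.0, 0.96), 1))
```

Under these assumptions, roughly 25 times more input DNA is needed than the amount actually analyzed, which is why sample-limited applications are poorly served by this approach.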

Bisulfite DNA Conversion Methods for Methylation Analysis

The most direct method for analysis of bisulfite converted DNA is direct sequencing (Frommer et al., 1992). Amplification of fragments of interest followed by sequencing will quickly and accurately identify all cytosines that were methylated, as all non-methylated cytosines will have been converted to uracil. One drawback to direct sequencing is the necessity to design amplification and sequencing primers that are based on all of the possible sequences depending on the level of methylation. The conversion of cytosine to uracil will alter the priming sequences along with the target sequences. Furthermore, sequencing is a labor intensive and time-consuming activity if one is investigating large numbers of sequences and/or large numbers of samples.

Methylation-Specific PCR (MS-PCR) is the most commonly used technique for analysis of methylation. MS-PCR is utilized to determine the methylation status of specific cytosines following conversion of unmethylated cytosines to uracil by bisulfite conversion (Herman et al., 1996). The methylation status of specific cytosines can be determined by utilizing primers that are specific for the cytosine of interest. The differences in sequences following conversion allow different primer sets to determine whether the initial sequence was methylated. Melting curve Methylation Specific PCR (McMS-PCR) replaced sequence analysis of the resulting PCR products with the more efficient process of melt curve analysis (Akey et al., 2002; Guldberg et al., 2002). Differences in the melting temperature of the products are due to the sequence differences resulting from bisulfite conversion of methylated versus unmethylated DNA samples. Another method for analyzing MS-PCR products using melting characteristics involves the use of denaturing high-performance liquid chromatography (Baumer, 2002). In this method, MS-PCR is carried out under conditions that will amplify both alleles (converted and unconverted cytosines). The products of MS-PCR are analyzed by HPLC under denaturing conditions, allowing the resolution of different products based on sequence differences due to bisulfite conversion.
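The reason MS-PCR needs separate primer sets can be shown by deriving the two converted templates for a primer-binding region. The sketch below is an illustration only and assumes that methylation occurs solely at CpG cytosines, that conversion is complete, and that a single strand is considered; the function name and the example region are hypothetical.

```python
from typing import Tuple


def converted_templates(region: str) -> Tuple[str, str]:
    """Return (methylated_template, unmethylated_template) after conversion.

    Non-CpG cytosines are read as T in both templates. CpG cytosines
    stay C in the fully methylated template and are read as T in the
    fully unmethylated template.
    """
    region = region.upper()
    meth, unmeth = [], []
    for i, base in enumerate(region):
        if base == "C":
            in_cpg = i + 1 < len(region) and region[i + 1] == "G"
            meth.append("C" if in_cpg else "T")
            unmeth.append("T")
        else:
            meth.append(base)
            unmeth.append(base)
    return "".join(meth), "".join(unmeth)


region = "GCGATCCGTC"  # hypothetical primer-binding region with two CpGs
m_template, u_template = converted_templates(region)
print(m_template)  # GCGATTCGTT
print(u_template)  # GTGATTTGTT

# Positions at which an M-specific and a U-specific primer would differ.
print([i for i, (a, b) in enumerate(zip(m_template, u_template)) if a != b])  # [1, 6]
```

The two templates differ only at the CpG cytosines, so primer discrimination rests on those few positions; this is one reason primer design has to be repeated for every site of interest.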

One version of MS-PCR, called MethyLight (Eads et al., 2000), involves the use of fluorescence-based real-time quantitative PCR to allow both detection and quantitation of the converted products in one step. The major drawback of these techniques is the necessity to design primers specific for each methylation site that are based on the different converted sequence possibilities. An additional modification to the MethyLight protocol involves using an additional fluorescent probe directed against unconverted DNA. This protocol, ConLight-MSP, was developed to address the issue of overestimation of methylation due to incomplete conversion of DNA by bisulfite (Rand et al., 2002). A second method aimed at addressing the problem of incomplete bisulfite conversion is bisulfite conversion-specific Methylation-Specific PCR (BS-MSP) (Sasaki et al., 2003). In this technique, two rounds of PCR are carried out following bisulfite conversion of DNA. In the first round, primers are utilized that do not contain CpGs, but do contain cytosines at the 3′ position. Thus, only fully converted DNA will be amplified in the first round of amplification. A second, traditional MSP amplification is subsequently carried out to amplify the CpGs of interest. This will result in a lower level of background amplification of sites with incomplete conversion of DNA, and a more accurate determination of the level of methylation in the sample.

Other methods for site-specific methylation analysis include COBRA, methylation-sensitive single nucleotide primer extension (MS-SNuPE), and methylation-sensitive single-strand conformation analysis (MS-SSCA). COBRA combines the techniques of bisulfite conversion with methylation-sensitive restriction endonuclease analysis (described below) to enable highly specific, highly sensitive quantitation of methylation sites contained within recognition sites for methylation-sensitive restriction enzymes (Xiong and Laird, 1997). Melting curve combined bisulfite restriction analysis (McCOBRA) was developed to allow analysis of bisulfite converted DNA without gel electrophoresis (Akey et al., 2002). In this procedure, bisulfite converted DNA is amplified by PCR with specific primer pairs surrounding a potential methylation site. The resulting PCR products are digested with a restriction enzyme that will only recognize and cut DNA that was originally methylated. Melt curve analysis will yield two peaks, based on the size difference of the cut versus uncut DNA, and allow the determination of the methylation status of that site in the original DNA. Another variation of COBRA, termed Pyro-sequencing methylation analysis (PyroMethA), involves the use of the Pyrosequencing reaction to determine methylation status in place of the restriction analysis used in COBRA (Collela et al., 2003; Tost et al., 2003). MS-SNuPE combines MS-PCR amplification of bisulfite converted DNA with single nucleotide extension of MS-PCR products to incorporate radio-labeled C (methylated) or T (unmethylated) that can be detected using a phosphoimager (Gonzalgo and Jones, 1997). The ratio of C/T incorporation will indicate the level of methylation at a particular site. Finally, MS-SSCA utilizes bisulfite converted DNA with single-stranded conformational polymorphism (SSCP) analysis to detect sequence differences through changes in the migration of the molecules during electrophoresis (Burri and Chaubert, 1999; Suzuki et al., 2000).
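The C/T incorporation readout of MS-SNuPE can be converted into a methylation fraction. The following sketch assumes that the two signals are background-corrected and that both labels are incorporated and detected with equal efficiency; the function name and the signal values are hypothetical.

```python
def methylation_fraction(c_signal: float, t_signal: float) -> float:
    """Estimate the methylated fraction at one site as C / (C + T).

    C incorporation reports methylated templates and T incorporation
    reports unmethylated templates.
    """
    if c_signal < 0 or t_signal < 0:
        raise ValueError("signals must be non-negative")
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal detected")
    return c_signal / total


# Hypothetical phosphoimager counts for one CpG site.
print(methylation_fraction(c_signal=30.0, t_signal=10.0))  # 0.75
```

The same calculation applies to any paired readout in which one signal reports the methylated template and the other the unmethylated template.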

Another method for analyzing the methylation status of specific sites was created based on changes in restriction endonuclease recognition sites following bisulfite conversion of DNA (Sadri and Hornsby, 1996). In this procedure, DNA is bisulfite converted and a specific region of interest is amplified by PCR. Following amplification, the resulting products are digested with either a restriction endonuclease that will only cleave the sequence generated by conversion of an unmethylated CpG, or a restriction endonuclease that will cleave the same site only if it was originally methylated and not converted by bisulfite treatment. Comparison of the products of digestion will indicate the methylation status of the site of interest and, potentially, relative levels of methylation of the site from a mixed population of cells. This method improves on normal MSP by not relying on differences in PCR amplification between converted and non-converted DNA. However, this method is also susceptible to incomplete conversion of the starting DNA. Furthermore, this method depends on bisulfite conversion creating a different restriction endonuclease recognition site. The authors estimated that approximately 25% of CpG sites would be able to be analyzed by this method, leaving the majority of CpG sites unanalyzed. A newly developed technique, HeavyMethyl, utilizes real-time PCR analysis of unconverted DNA (Cottrell et al., 2004). Specificity for methylated sites is achieved by using a methylation sensitive oligonucleotide blocker. This blocker will only bind to unmethylated DNA, blocking annealing of the primer and preventing amplification. Methylated sequences will not bind the blocker and will be primed and extended, resulting in cleavage of the probe and fluorescent detection. The advantages of this system include lowered background, higher specificity of signal, and decreased requirement for starting material due to the lack of a bisulfite conversion step. However, development of each assay will require the design and optimization of 5 oligonucleotides: 2 primers, 2 blocking oligonucleotides, and a probe. This requirement will greatly increase the difficulty and cost of developing site-specific assays. Furthermore, small samples of DNA will only yield enough material for a few assays and will not allow analysis of large numbers of potential methylation sites.

All of the aforementioned methods that can be used to analyze bisulfite-converted DNA require several nanograms of converted DNA per assay and are thus impractical for genomewide methylation analysis. To allow genomewide methylation analysis by these methods, techniques must be utilized that can efficiently amplify small quantities of converted DNA.

Non-Bisulfite Based Methods of Methylation Analysis

Non-bisulfite based methods for analysis of DNA methylation rely on the use of methylation-sensitive and methylation-insensitive restriction endonucleases (Cedar et al., 1979). Following digestion of sample DNA with either methylation-sensitive or methylation-insensitive restriction enzymes (e.g., HpaII and MspI, respectively), the DNA can be analyzed by methods such as Southern blotting and PCR. Southern blot analysis involves electrophoretic separation of the resulting DNA fragments and hybridization with a labeled probe adjacent to the CpG of interest. If the hybridization signals from the methylation-sensitive and methylation-insensitive digested DNA samples result in bands of different sizes, then the site of interest was methylated. In contrast, PCR analysis involves amplification across the CpG of interest. The expected band will only be observed in the methylation-sensitive digested sample if the site of interest is methylated. The disadvantages of the Southern blotting assay are that specific probes must be developed for every site of interest and large amounts of starting DNA (e.g., 10 μg) are required. The PCR assay requires much lower amounts of DNA for each site of interest (e.g., 1-10 ng), but necessitates the design and testing of specific primer pairs for every site of interest. Furthermore, although each individual assay requires only nanogram quantities of DNA, analysis of hundreds or even thousands of potential methylation sites still involves μg quantities of DNA. The overall limitation of these technologies is their dependence on the presence of a methylation-sensitive restriction site at the CpG of interest. Thus, although these assays are relatively quick and simple, they cannot be used to test all potential methylation sites. Furthermore, these methods can only be used for analysis of sites that have been previously identified and have had detection assays designed for them, and they do not allow for the discovery of new sites of interest.
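The comparison of a methylation-sensitive and a methylation-insensitive digest can be modeled in silico. The sketch below uses the isoschizomer pair HpaII and MspI, which both recognize CCGG and cut after the first C; HpaII is modeled as blocked when the internal cytosine is methylated, while MspI is modeled as cutting regardless. Only one strand is considered, and the function name and example sequence are hypothetical.

```python
import re
from typing import Iterable, List


def digest(seq: str, methylated_positions: Iterable[int], enzyme: str) -> List[str]:
    """Return the fragments produced by HpaII or MspI at CCGG sites.

    `methylated_positions` holds 0-based indices of methylated cytosines.
    """
    if enzyme not in ("HpaII", "MspI"):
        raise ValueError("enzyme must be 'HpaII' or 'MspI'")
    seq = seq.upper()
    methylated = set(methylated_positions)
    cuts = []
    for match in re.finditer("CCGG", seq):
        internal_c = match.start() + 1
        if enzyme == "HpaII" and internal_c in methylated:
            continue  # methylated site is protected from HpaII
        cuts.append(match.start() + 1)  # cut between C and CGG
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


seq = "AACCGGTTTTCCGGAA"       # hypothetical sequence with two CCGG sites
methylated = [11]               # internal C of the second site is methylated

print(digest(seq, methylated, "MspI"))   # ['AAC', 'CGGTTTTC', 'CGGAA']
print(digest(seq, methylated, "HpaII"))  # ['AAC', 'CGGTTTTCCGGAA']
```

The fragment spanning the second site survives only the HpaII digest, which corresponds to the differently sized Southern band, or the retained PCR product, that indicates methylation at that site.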

Ligation-mediated PCR (LM-PCR) was developed to increase the sensitivity of methylation analysis by restriction endonuclease digestion (Steigerwald et al., 1990). In this method, the methylation status of specific sites is determined. DNA is digested with a methylation-sensitive restriction endonuclease that will cleave a site of interest, along with a methylation-sensitive restriction endonuclease that will cut in fairly close proximity to the methylated site of interest. Following digestion, a primer extension reaction is performed using a previously characterized primer that is upstream from both digestion sites. A linker sequence is ligated to the resulting end of the extended sequence. A second primer extension step is performed using a primer based on the linker sequence, and PCR amplification is performed using the linker sequence and a nested primer downstream from the primer used in the primary primer extension reaction. The products of amplification are analyzed by gel electrophoresis. Two potential bands are produced by this method: a full length amplicon indicating methylation of the target sequence, and a shorter amplicon indicating a lack of methylation. A mixture of both products indicates that partial methylation existed in the sample, and an estimation of the amount of methylation can be determined by comparison of the ratio of the two products. This method greatly improved on the sensitivity of PCR-based methods of analysis, but is greatly hindered by the necessity of creating 2 primers for each locus of interest, and the requirement for analyzing 1 specific site per reaction.

The technique of Differential Methylation Hybridization (DMH) has been utilized to screen CpG island arrays to determine methylation status of a large number of sites at a time (Huang et al., 1999). In this procedure, DNA is digested with a frequent cutting restriction endonuclease to generate small DNA fragments. Linkers are ligated to the products of digestion and repetitive DNA is subtracted. The resulting molecules are digested with a methylation-sensitive restriction endonuclease. PCR of the digestion products with a primer complementary to the linkers results in amplification of all molecules that contain either methylated restriction sites or no restriction sites. The products of amplification are then hybridized to a CpG island array consisting of clones containing multiple restriction endonuclease sites for the enzyme used to digest the DNA. Hybridization to a clone indicates that the site was methylated in the starting DNA. This method requires the generation of a large number of clones for creation of the array and is limited by the ability to amplify the products of the original digestion. Many fragments will be either too large to be amplified, or be so small as to result in suppression of amplification or poor hybridization to the array. Furthermore, there will be a high level of background of products that do not contain methylation sites of interest that will affect the signal to noise ratio of the array hybridization.
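The selection logic of DMH, including the sources of background and drop-out noted above, can be summarized as a small model. The sketch is illustrative only: the `Fragment` type, the function names, and the 200-2000 bp amplifiable size window are hypothetical assumptions, not values taken from this document.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fragment:
    name: str
    length_bp: int
    # One entry per methylation-sensitive site in the linker-ligated
    # fragment; True means the site is methylated (protected).
    site_methylated: List[bool] = field(default_factory=list)


def survives_digestion(frag: Fragment) -> bool:
    """A fragment stays intact if every site is methylated.

    Fragments with no sites also survive (all([]) is True), which is
    the source of background amplification.
    """
    return all(frag.site_methylated)


def is_amplified(frag: Fragment, min_bp: int = 200, max_bp: int = 2000) -> bool:
    """Intact fragments amplify only inside an assumed size window."""
    return survives_digestion(frag) and min_bp <= frag.length_bp <= max_bp


fragments = [
    Fragment("methylated_island", 600, [True, True]),
    Fragment("unmethylated_island", 600, [False, True]),
    Fragment("no_site_background", 800, []),
    Fragment("methylated_but_too_long", 5000, [True]),
]

for frag in fragments:
    print(frag.name, is_amplified(frag))
```

In this toy model the background fragment without any restriction site is amplified alongside the methylated island, while a methylated fragment outside the size window is lost, mirroring the two limitations described for the method.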

Yan et al. (2001) and Chen et al. (2003) have developed a closely related method referred to as Methylation Target Arrays (MTA), derived from the concept of tissue microarray, for simultaneous analysis of DNA hypermethylation in multiple samples. In MTA, target DNA is digested with four-base restriction endonucleases, such as MseI, BfaI, NlaIII, or Tsp509I, known to restrict DNA into short fragments, but to retain CpG islands relatively intact. The GC-rich fragments are then isolated through an affinity column containing methyl-binding MeCP2 protein. Linkers are ligated to the overhangs of the CpG island fragments, which are then digested with methylation-sensitive restriction enzymes, BstUI and HpaII. Finally, the fragments are amplified with flanking primers. CpG sites that are methylated are protected from cleavage and are amplified in the process, whereas non-methylated CpG islands are lost to restriction. Initially, a microarray containing 7,776 short GC-rich tags tethered to glass slide surfaces was used to study 17 paired tissues of breast tumors and normal controls. Amplicons, representing differential pools of methylated DNA fragments between tumors and normal controls, were co-hybridized to the microarray panel. Hypermethylation of multiple CpG island loci was then detected in a two-color fluorescence system. Hierarchical clustering segregated these tumors based on their methylation profiles and identified a group of CpG island loci that corresponds to the hormone-receptor status of breast cancer. A panel of 468 MTA amplicons, representing the whole repertoire of methylated CpG islands in 93 breast tumors, 20 normal breast tissues, and 4 breast cancer cell lines, was arrayed on a nylon membrane for probe hybridization. Hybridization was performed with PCR-generated probes for 10 promoters, labeled with ³²P-dCTP. Positive hybridization signals detected in tumor amplicons, but not in normal amplicons, were indicative of aberrant hypermethylation in tumor samples. This was attributed to aberrant sites that were protected from methylation-sensitive restriction digestion and were amplified by PCR in tumor samples, while the same sites were restriction digested and could not be amplified in normal samples. Hypermethylation frequencies of the 10 genes GPC3, RASSF1A, 3OST3B, HOXAS, uPA, WT1, BRCA1, DAPK1, and KL were tested in breast tumors and cancer cell lines.

The aforementioned DMH and MTA technologies are described in U.S. Pat. No. 6,605,432, PCT WO03/087774A2, and U.S. Patent Application US20030129602A1 by Huang (see below). Drawbacks of these methods are the lack of complete coverage of all regions of the genome during the initial restriction digest, generation of false positive results due to incomplete cleavage by a methylation-sensitive restriction enzyme, inability to analyze nicked, degraded, or partially double-stranded DNA from body fluids, as well as lack of quantitation and relatively low sensitivity. Thus, these techniques are limited to applications in which large quantities of DNA are readily available and methylated DNA represents a high percentage of the total DNA. Therefore, a sensitive diagnostic method that is capable of amplifying all regions of the genome and detecting methylation in samples containing only a small fraction of methylated DNA in a vast majority of non-methylated DNA is still needed.

Several techniques have been developed in order to identify unknown methylation hotspots, including restriction landmark genomic scanning (RLGS), methylation-sensitive representational difference analysis (MS-RDA), methylated CpG island amplification-representational difference analysis (MCA-RDA), methylation-sensitive arbitrarily primed PCR (MS-AP-PCR), methylation-spanning linker libraries (MSLL), differential methylation hybridization (DMH, see above), methylation-sensitive amplification polymorphism (MSAP), affinity capture of CpG islands, and CpG island microarray analysis (see above).

RLGS involves the digestion of high molecular weight DNA by a methylation sensitive restriction endonuclease, such as NotI, that targets CpG islands (Hayashizaki et al., 1993). The products of digestion are differentiated by two dimensional gel electrophoresis involving second and third digestions with non-methylation sensitive restriction endonucleases (Rush and Plass, 2002). The pattern of banding between two samples can be compared to determine changes in methylation status. Subsequently, these techniques have been expanded to include cloning of specific bands from the 2-D gel in order to identify methylated sequences. Recently, computer based RLGS systems have been developed to predict banding patterns based on digestion of genomic DNA with methylation-sensitive restriction endonucleases (Masuyama et al., 2003; Rouillard et al., 2001; Akiyoshi et al., 2000). The drawbacks of these techniques include a requirement for a large amount of starting material, the difficulty of resolving complex samples containing cells with different methylation patterns, and the large amount of work necessary to identify all of the bands of interest. Furthermore, although this technique is reproducible, sequence variations between samples can result in gain or loss of cleavage sites, resulting in changes in the banding pattern that are not related to changes in methylation.

Methylation-sensitive representational difference analysis (MS-RDA) was developed to determine differences in methylation status between control and cancer samples to allow the identification of methylated regions in cancer (Ushijima et al., 1997; Kaneda et al., 2003). In this method, two DNA samples (Tester and Driver) are digested with a methylation-sensitive restriction endonuclease. The resulting products from each sample have an adaptor ligated to them and are amplified by PCR. Following amplification, the adaptors are removed and a second adaptor is ligated to the 5′ end of the tester sample. The two samples are mixed, with the driver in large excess compared to the tester. Denaturing and annealing steps result in the production of mostly driver/driver or driver/tester molecules for sites that were methylated in the driver and the tester DNA, and tester/tester molecules for sites that were methylated in only the tester DNA sample. The resulting 3′ ends are filled in, producing molecules with the second adaptor at both ends only in the case of tester/tester hybridization. Amplification of the tester/tester hybrids by PCR using the second adaptor sequence results in isolation of those sites methylated only in the tester sample. The enriched molecules can then be analyzed by a number of techniques known in the art, including PCR, microarray hybridization, and sequencing. Although this protocol has been useful in the identification of specific methylation differences between cancer and normal samples, the methodology has several inherent limitations. These include the requirement for two restriction endonuclease sites within close enough proximity to allow PCR amplification, but not so close as to result in suppression of the resulting products. Furthermore, RDA produces only enrichment of sequences and does not completely select against sites that are methylated, as some tester/tester hybrids are formed even in the presence of a large excess of driver.

Another related procedure, methylated CpG island amplification-representational difference analysis (MCA-RDA), was developed to amplify and enrich methylated CpG islands present in the tester DNA (Toyota et al., 1999; Toyota and Issa, 2002). In this method, tester and driver are first digested with a methylation-sensitive restriction endonuclease that results in blunt ends (e.g., Sma I). Subsequently the methylated restriction sites are cleaved with a non-methylation-sensitive isoschizomer of the first endonuclease (e.g., Xma I) that produces overhanging ends. Adaptors are ligated to the resulting overhanging ends, but not to the blunt ends. The molecules that contain an adaptor at both ends are amplified by PCR and RDA is performed as described above to select for those molecules only present in the tester population. This protocol improves on MS-RDA by amplifying entire CpG islands. However, this method is even more limited than MS-RDA in that appropriate isoschizomers for methylated restriction sites are required to produce the libraries.

The procedure of methylation-sensitive arbitrarily primed PCR (MS-AP-PCR) was developed in order to identify genomic regions with altered patterns of methylation (Gonzalgo et al., 1997). In this method, DNA is digested with methylation-sensitive and methylation-insensitive restriction endonucleases. Following digestion, arbitrarily primed PCR is performed using short primers under low-stringency conditions for a couple of cycles, followed by high-stringency amplification. The products are separated by high-resolution polyacrylamide gel electrophoresis and band differences between control and test samples are isolated and sequenced. The banding patterns observed during electrophoresis are fairly reproducible between reactions because a specific primer sequence is utilized for each reaction. Random primed PCR is different in that it utilizes degenerate primers that contain a large number of primer sequences.

Epigenetic boundaries were identified in corn by creating methylation-spanning linker libraries (MSLL) (Yuan et al., 2002). In this method, genomic DNA is digested with a methylation-sensitive restriction endonuclease and ligated into BAC vectors. The resulting libraries are end-sequenced and analyzed for methylated DNA sites. This technique allows the determination of methylated sequences without a priori knowledge, and allows the improved cloning and sequencing of genomic regions that are resistant to shotgun cloning. However, MSLL is a low-throughput technology that is limited by the constraints of sequencing large numbers of clones that will contain many repeats of the same insertion sequences.

Methylation-sensitive amplification polymorphism (MSAP) has been utilized to determine changes in methylation patterns in banana plants (Peraze-Echeverria et al., 2001). In this technique, a double digest is performed on two aliquots of DNA. A common methylation-insensitive restriction endonuclease is utilized in both digestions. The second restriction endonuclease is methylation-sensitive in one digest (e.g., Hpa II), and a methylation-insensitive isoschizomer (e.g., Msp I) in the other digest. The resulting products of digestion have adaptors ligated to them and are amplified under various selective conditions. The amplicons are then subjected to gel electrophoresis and detection. Comparisons are made between the aliquots digested with methylation-sensitive and methylation-insensitive restriction endonucleases, and between samples. Changes in the banding patterns are recorded as changes in methylation patterns in different samples. This technique allows the amplification and analysis of specific sites of methylation, but is dependent on the existence of methylation-sensitive and methylation-insensitive restriction endonuclease isoschizomers.

The Methylation-Dependent Restriction Endonuclease McrBC

McrBC is an E. coli protein complex that cleaves DNA based on recognition of RmC sequences that are separated by 40 to 3000 bp (Sutherland et al., 1992; Stewart and Raleigh, 1998). McrBC-induced cleavage occurs by DNA translocation following binding of the DNA at the RmC recognition site, resulting in interaction of two McrBC substrates (Dryden et al., 2001). Thus, McrBC does not always cleave at the same location between methylation sites, and different patterns of cleavage can be observed in DNA with multiple methylation sites at varying distances from each other, depending on the number and density of methylated sites. The requirement of McrBC for the two methylation recognition sites to occur on the same strand (cis) or on opposite strands (trans) is not clear. There has been one report of successful cleavage of both cis methylated DNA and trans methylated DNA (Sutherland et al., 1992), but further clarification of this issue is required.

There is an example of McrBC being used to identify methylated regions of interest (PCT WO 03/035860). This method involves the degradation of two sources of DNA. One sample is degraded with an enzyme such as McrBC, and one sample is degraded with a methylation-sensitive restriction endonuclease. The hybridization of the two samples provides a screen to determine which samples were cut with McrBC. The hybridized products are isolated and the resulting molecules are sequenced to identify the methylated regions of interest. While this protocol is aimed at universal detection of global methylation patterns through use of McrBC, it involves a subtractive procedure and does not allow the amplification of the products following subtraction and isolation.

Other uses for McrBC that have been reported include using McrBC-expressing bacterial strains to digest plasmids containing genomic DNA in order to subtract repetitive (i.e., heavily methylated) elements and thereby isolate genomic regions of interest from plants (U.S. Patent Application US20010046669). The specific steps involve fragmenting DNA, inserting the DNA fragments into a suitable vector, and then inserting the library DNA into McrBC-expressing bacteria. The bacteria will cleave any vectors that contain genomic inserts with multiple methylated sites. Thus, only plasmids carrying non-methylated inserts will remain intact and grow. The resulting colonies contain molecules from regions of hypomethylation. This method was utilized to increase the cloning of gene-coding regions from plant genomes.

Methylation patterns in simple genomes have been investigated by use of McrBC cleavage (Badal et al., 2003). In this work, the methylation patterns of HPV were investigated in cervical cancer. Viral genomic DNA was digested by McrBC and the resulting fragments underwent bisulfite sequencing. The small size of the HPV genome (7900 bp) allows repetitive sequencing efforts to quickly identify all sequences and methylation sites within the HPV genome. This methodology has limited application to human DNA due to the large size of the human genome. Furthermore, there are no mechanisms for amplifying or selecting molecules based on their methylation status.

Patents and Patent Applications Related to Methylation Detection and Analysis

U.S. Pat. No. 6,214,556 B1 and corresponding PCT WO99/28498 issued to Olek et al. describe a method of methylation analysis in which DNA is fragmented by means of mechanical shearing or digestion with a restriction endonuclease and then treated with sodium bisulfite to convert non-methylated cytosine to uracil. Converted DNA is amplified by two different methods. In the first method, double-stranded adaptor molecules of known sequence are ligated to the DNA fragments before bisulfite conversion and then amplified by polymerization using primers complementary to the adaptor sequences present after the bisulfite treatment. In some versions of the method, the primers used for amplification can also contain 3′-extensions, one to four bases long, that go into the unknown sequence and that represent different base permutations. In the second method, representing a modification of the DOP-PCR technique, primers that contain a constant 5′ region and a degenerate 3′ region are used to amplify converted DNA fragments or subsets of them. In both methods of amplification two types of sequences are used for amplification. Type one sequences completely lack cytosine or only have cytosine in the context of the CpG dinucleotide, and type two sequences completely lack guanine or only have guanine in the context of the CpG dinucleotide. These two types of sequences are used to specifically target strands of DNA that are rich in guanine or rich in cytosine respectively after bisulfite conversion. Overall the quantity of the remaining cytosines on the G-rich strand or the quantity of remaining guanines on the C-rich strand is determined by hybridization or by polymerization. In one version of the method, the target DNA is cleaved with a methylation-sensitive restriction enzyme prior to bisulfite conversion for the obvious reason of reducing the amount of non-methylated DNA.

The method described above suffers from the inherent drawbacks of all techniques based on bisulfite conversion, namely reduced sensitivity due to significant loss of DNA during the process of bisulfite conversion that compromises the analysis of clinical samples containing only a small percentage of methylated DNA in a vast majority of non-methylated DNA, as well as problems implementing the method to assay methylation in clinical settings due to multiple and complex preparation steps.

U.S. Patent Applications 20030099997A1 and 20030232371A1 and corresponding PCT WO 03/035860A1 by Bestor disclose methods for detection of methylated promoters and gene identification based on differential hybridization of test and control DNA samples, one of which has been treated with the methylation-dependent endonuclease McrBC and the other with a methylation-sensitive restriction endonuclease (HpaII, HhaI, MaeII, BstU, or AciI). The two samples are modified so as to prevent formation of duplexes between homologous DNA fragments. The samples from the two sources are then denatured and hybridized to form hetero-duplexes. The modification of at least one of the samples is performed in such a way as to facilitate the isolation of the resulting hetero-duplexes, which are then analyzed by sequencing, and the positions of methylated cytosines are determined. Although this technology can accurately determine the methylation status of a gene promoter and allows for the discovery of new sites of interest, it suffers from limitations such as the requirement for a significant amount of starting DNA material, inability to process multiple samples simultaneously, and dependence on the presence of a methylation-sensitive restriction site at the CpG of interest.

PCT WO 03/027259A2 by Wang describes a method for analysis of the methylation status of test and control DNA samples based on cleavage of the DNA with methylation-sensitive restriction enzyme(s), ligation of linkers to the generated overhangs, PCR amplification and labeling of the fragments receiving ligated linkers, hybridization of the fragments on a solid support containing immobilized target DNA sequences, and comparison of the signals produced after hybridization of the test and control samples, thereby detecting the extent of methylation of one or more regions of DNA. This method is limited by its dependence on the presence of a methylation-sensitive restriction site at the CpG site(s) of interest and by the fact that it can only be used for analysis of sites that have been previously identified. Thus, it does not allow for the discovery of new methylation sites of interest.

PCT WO 03/025215A1 by Carrol et al. describes a method for analysis of DNA methylation patterns by digesting DNA with a methylation-sensitive restriction enzyme followed by amplification with primers annealing to the non-cleaved form of the recognition sequence. The results of the amplification reaction are then compared to an identical reaction run in parallel using the same primers to amplify another aliquot of the DNA sample that has not been cleaved with restriction enzyme. This method is limited by the availability of suitable restriction sites and requires significant amounts of input DNA for analysis of multiple restriction sites. In addition, it depends on the complicated design and empirical testing of primers for each of thousands of potentially methylated sites required for successful profiling, each with very high GC content.

PCT WO 03/080862A1 to Berlin discloses a method and devices for amplification of nucleic acids retaining the methylation pattern of the original template. The method comprises denaturing of genomic DNA, annealing of specific primers in an extension/polymerization reaction with DNA polymerase, and incubation of the resulting double-stranded DNA with a methyltransferase in the presence of a labeled methyl group donor to restore the methylation pattern encoded in the original template. The described steps are repeated several times, resulting in linear amplification that retains the methylation status of the target DNA. Amplified DNA is then digested by a methylation-sensitive restriction enzyme or subjected to bisulfite conversion, and the resulting products are analyzed by methods capable of retrieving the methylation information. While this method can amplify DNA regionally while retaining the methylation information of pre-designed sites, amplification of DNA in linear mode is a slow and inefficient process, as opposed to exponential amplification. Furthermore, the amount of input DNA required for the procedure is still significant. In addition, this method is limited to regions for which there is prior knowledge of methylation. Thus, it cannot be applied for genome-wide screening of methylation patterns.

U.S. Pat. No. 6,300,071B1 issued to Vuylsteke et al. describes a method for detecting DNA methylation using the technique of Amplified Fragment Length Polymorphisms (AFLP). A test and a control DNA sample are digested with one or more specific restriction endonucleases to fragment DNA into a series of restriction fragments. The resulting restriction fragments are ligated with one or more double-stranded synthetic oligonucleotide adaptors. A combination of methylation-sensitive and methylation-insensitive restriction enzymes is used to produce amplifiable fragments that originate from either methylated or non-methylated DNA. A combination of primers that are complementary to specific promoter sequences and primers complementary to adaptor sequences is used for PCR amplification and the resulting fragments are analyzed by gel electrophoresis for restriction patterns. This method can be used for simultaneous analysis of methylation at multiple promoters but requires prior knowledge of sequences and empirical testing of multiple primers for compatibility, and has limited application for clinical diagnostics.

Patent US 2005/0009059A1 issued to Shapero et al. provides a method for determining if a cytosine in a target DNA sequence is methylated by the steps of: fragmentation with a restriction enzyme, ligation of a double-stranded adaptor with a common priming sequence, conversion of non-methylated cytosines to uracils by treatment with sodium bisulfite, and hybridizing a capture probe comprising a second common sequence, a tag sequence, a recognition sequence for a Type IIS restriction enzyme, and a region that is complementary to a region of the target sequence 3′ of a cytosine. The capture probe is extended and amplified with first and second common sequence primers to generate a double-stranded extended capture probe that is then digested with the Type IIS restriction enzyme. The resulting fragments are extended by one base with a labeled nucleotide and analyzed using an array of oligonucleotide probes. Like other methods in the art based on conversion with sodium bisulfite, the method described in this patent is limited to using only relatively large amounts of input DNA and requires design of complex oligonucleotide probes that are difficult to make compatible in a multiplex reaction.

U.S. Pat. No. 6,605,432, PCT WO03/087774 A2, and U.S. Patent Application US20030129602A1 by Huang describe the previously discussed Differential Methylation Hybridization (DMH) and Methylation Target Arrays (MTA) technologies (see Yan et al., 2001, Chen et al., 2003, and Huang et al., 1999). One to two micrograms of genomic DNA isolated from tumor or control samples are digested overnight with Mse I, a four-base restriction enzyme that cuts frequently in the rest of the genome but less frequently in CpG islands, leaving promoter sites relatively intact. Digested products are purified and ligated to a double-stranded linker of known sequence. Ligated DNA fragments are then purified and digested overnight with the methylation-sensitive restriction enzyme BstUI. After purification and buffer exchange the samples are digested again overnight with another methylation-sensitive restriction enzyme, HpaI. Samples are amplified by PCR using a primer complementary to the known linker sequence. The resulting products are labeled and hybridized to microarrays comprising CpG island clones or other CpG-rich genomic probes.

The methods described in these patents require microgram quantities of DNA and involve multiple steps including 3 overnight digestions and 3 purification steps. They also suffer from additional drawbacks such as the lack of complete coverage of all regions of the genome during the initial restriction digest. Regions with a low density of cleavage sites will not be amplified and their methylation status cannot be determined using this technology. Incomplete cleavage by a methylation-sensitive restriction enzyme will produce false positive results. Also, if the DNA source is nicked or degraded or only partially double-stranded, as is often the case with DNA in blood circulation or other body fluids, cleavage with a restriction enzyme will be inefficient and the method will perform poorly. In addition, the method of detection by microarray hybridization employed in these techniques is not quantitative and has limited dynamic range and low sensitivity. Thus, the methods described in these patents are limited to applications in which large quantities of DNA are readily available and methylated DNA represents a high percentage of the total DNA.

The aforementioned methods in the art that employ adaptor ligation to DNA fragments are suitable for high molecular weight DNA samples and for partially degraded DNA but not for circulating, cell-free DNA samples from serum, plasma, and urine, which are heavily degraded and comprised substantially of mono-, di-, and tri-nucleosomal sized fragments shorter than 500 bp. First, a 4-bp recognition sequence restriction enzyme only cleaves on average every 256 base pairs, so methods that rely on such cleavage prior to adaptor ligation will not be applicable to any mononucleosomal sized fragments and to only a minority of dinucleosomal sized fragments. Second, there are no descriptions in the art for converting heavily damaged DNA containing nicks or single-stranded gapped regions into amplifiable molecules that retain methylation information. These limitations of the art preclude effective methylation analysis of DNA from non-invasive clinical sources such as serum, plasma, and urine, since a majority of the DNA may remain in an unamplifiable form. Thus, there exists a need for methods that can amplify substantially all the DNA from such sources to increase the sensitivity of methylation assays and to reduce the quantity of such DNA required for analysis. These novel methods will be of particular importance for diagnostic applications, where methylated markers indicative of a condition may exist only as a minor (<1%) fraction within the samples.
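The 256 bp figure follows from the four possible bases at each of the four positions of the recognition sequence (4^4 = 256). The following is a minimal sketch of that arithmetic, assuming an idealized random sequence with equal base frequencies and independent positions; real genomes, in which CpG is under-represented, will deviate from it. The function names are hypothetical and are introduced only for illustration.

```python
from math import comb


def expected_site_spacing(site_length: int) -> int:
    """Average spacing (bp) between occurrences of one fixed recognition
    sequence in random DNA with equal base frequencies (assumption)."""
    return 4 ** site_length


def prob_at_least_k_sites(fragment_length: int, site_length: int, k: int) -> float:
    """Approximate probability that a fragment carries at least k sites.

    Each possible start position is treated as an independent trial
    (a binomial approximation, not an exact result).
    """
    positions = fragment_length - site_length + 1
    if positions <= 0:
        return 0.0
    p = 1.0 / 4 ** site_length
    p_fewer = sum(
        comb(positions, i) * p ** i * (1.0 - p) ** (positions - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# A fragment needs a cut near each end to receive two adaptors, so k = 2.
mono = prob_at_least_k_sites(200, 4, 2)  # mononucleosomal-sized fragment
di = prob_at_least_k_sites(400, 4, 2)    # dinucleosomal-sized fragment
print(expected_site_spacing(4), round(mono, 2), round(di, 2))
```

Under these assumptions only a minority of 200 bp and 400 bp fragments carry two sites for a given 4-bp cutter, which is consistent with the limitation described above.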

SUMMARY OF THE INVENTION

The present invention relates to novel methods and compositions for determining and analyzing methylation of a DNA molecule by preparing a plurality of fragments using restriction enzymes that differentiate between methylated and non-methylated regions, incorporating a known sequence at the end of said DNA fragments, amplifying said DNA fragments and determining the methylation status of one or more regions in the original DNA molecule. In a general aspect of the invention, the methods change the ratio of methylated to non-methylated DNA in a plurality of DNA molecules, such as by eliminating nonmethylated regions and retaining methylated regions, and in further aspects this difference is amplified and/or quantitated. In other words, there may be elimination or substantial reduction of background material, which may be considered the nonmethylated fraction in a plurality of DNA molecules, such that there is a change in the ratio of methylation. Such an enrichment may be at least 1000× compared to the original plurality of DNA molecules, for example.
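The effect of such an enrichment on sample composition can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that the methylated fraction is retained completely and that the nonmethylated background is reduced by a uniform factor; the function names and the example values are hypothetical and are not measurements from the invention.

```python
def methylated_fraction_after_depletion(initial_fraction: float,
                                        depletion_fold: float) -> float:
    """Fraction of methylated DNA after the nonmethylated background is
    reduced depletion_fold times (methylated DNA assumed fully retained)."""
    methylated = initial_fraction
    background = (1.0 - initial_fraction) / depletion_fold
    return methylated / (methylated + background)


def enrichment_fold(initial_fraction: float, depletion_fold: float) -> float:
    """Change in the methylated:nonmethylated ratio caused by the depletion."""
    ratio_before = initial_fraction / (1.0 - initial_fraction)
    ratio_after = initial_fraction / ((1.0 - initial_fraction) / depletion_fold)
    return ratio_after / ratio_before


# Example: methylated DNA at 0.1% of the total, background reduced 1000-fold.
print(methylated_fraction_after_depletion(0.001, 1000.0))  # close to one half
print(enrichment_fold(0.001, 1000.0))                      # close to 1000
```

In this idealized case a marker present at roughly one part in a thousand becomes about half of the remaining material.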

In particular embodiments, the present invention regards the preparation and amplification of special Methylome DNA libraries and subsequent identification of specific DNA sequences that are either hypermethylated or hypomethylated. In comparison to the whole genome libraries (see, for example, U.S. patent application Ser. Nos. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned and Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403 and incorporated by reference herein in its entirety), the Methylome libraries are characterized by a selective depletion or even complete elimination of sequences corresponding to those originally non-methylated CpG-rich genomic regions, or by a substantial enrichment of the originally methylated CpG-rich genomic regions, or a combination thereof. In some embodiments, the Methylome libraries are created through cleavage with at least one methylation-sensitive restriction enzyme. In specific embodiments, the Methylome libraries are created through cleavage with a mixture of two or more, such as five or more, methylation-sensitive restriction enzymes. In other embodiments, the Methylome libraries are created through cleavage with one or more methylation-specific enzymes, such as the methylation-dependent cleavage enzyme McrBC, for example. In a separate embodiment, the Methylome libraries are created by cleavage with enzyme McrBC and a mixture of methylation-sensitive restriction enzymes. In a particular embodiment, the DNA molecule or molecules is altered differentially, and the alteration may be any kind of alteration, but in exemplary embodiments it comprises cleavage and/or bisulfite conversion.

The DNA molecules of the present invention for which the methods are employed such that a differential characteristic, for example, methylation, is determined may be of any kind, although in a particular embodiment of the invention the DNA molecule is damaged DNA, such as DNA that results from apoptotic degradation, for example. That is, upon apoptosis of a cell, the DNA is released from the cell and, in specific embodiments, ultimately enters the blood or urine, for example. Thus, the DNA may be considered as circulating within the body and may even pass the kidney barrier. The DNA produced by apoptosis may be fragmented in between nucleosomes, such as being digested mononucleosomally (with a fragment size of about 200 nt), dinucleosomally (with a fragment size of about 400 nt), and so forth. In fact, subjecting the apoptotic-produced fragmented DNA to gel electrophoresis often produces a banding pattern, as opposed to a smear expected for DNA that is randomly fragmented, for example. In further specific embodiments, the apoptotic-produced fragmented DNA further comprises nicks and/or gaps in the DNA fragments. Thus, in particular the DNA molecules for the methods herein may be referred to as substantially fragmented and/or cell-free DNA, and in specific aspects the majority of the molecules are less than about 1 kb in size. Methods of the present invention may employ relatively non-invasive methods to collect samples, such as by voided urine or intravenous blood collection, for example. Thus, although in particular embodiments the DNA molecules of the present invention are naturally produced in vivo, in alternative embodiments the DNA molecules of the present invention may be artificially fragmented, such as by nucleases, for example.

In particular aspects of the invention, information regarding the methylation status of one or more specific sequences is obtained by analyzing at least part of one or more DNA molecules, which may be referred to as a library, such as a Methylome amplification library. For example, a nucleic acid molecule, such as genomic DNA, is digested with a restriction enzyme that cleaves DNA based on methylated CpG, such as McrBC, or it is digested with one or more, such as a mixture of several, restriction enzymes unable to cleave sites having a methylated CpG. The resulting DNA fragments are incorporated into a library and selectively amplified.

In some embodiments, part or all of a particular group of 11 methylation-sensitive restriction endonucleases, specifically, Aci I, Bst UI, Hha I, HinP1 I, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi EI, and Hga I, that have 4-5 base pair recognition sites with at least one CpG dinucleotide, and that have the characteristic of being unable to digest recognition sites having a methylated CpG, may be used to selectively cleave unmethylated CpG regions within DNA prior to, or, in another embodiment, after library preparation. The spatial distribution of recognition sites for these particular nucleases in the human genome closely follows the distribution of the CpG dinucleotides, with their density being very high in the CpG-rich regions (CpG islands). As a result, non-methylated CpG-rich regions, such as those of gene promoters in normal cells, are susceptible to enzymatic cleavage and are digested to very short fragments. Methylated CpG regions, such as those that become hypermethylated in some gene promoters of cancer cells, resist cleavage and remain intact.
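The selective cleavage described above can be sketched in silico. The following is a minimal illustration, not the procedure of the invention: it uses only four of the listed enzymes, with their commonly published recognition sequences (which are not stated in the text above), places each cut at the middle of the site rather than at the true cleavage position, and treats a site as protected when any CpG within it is methylated. All function and variable names are hypothetical.

```python
# Commonly published recognition sequences for four of the listed enzymes
# (an assumption of this sketch; the text above does not state them).
SITES = {
    "Hpa II": ["CCGG"],
    "Hha I": ["GCGC"],
    "Bst UI": ["CGCG"],
    "Aci I": ["CCGC", "GCGG"],  # non-palindromic site, both orientations
}


def cut_positions(seq, methylated_cpgs):
    """Return sorted cut coordinates for all unprotected sites.

    methylated_cpgs holds 0-based indices of the C in each methylated CpG.
    A site is skipped when any CpG inside it is methylated.
    """
    cuts = set()
    for motifs in SITES.values():
        for motif in motifs:
            start = seq.find(motif)
            while start != -1:
                cpgs = [start + i for i in range(len(motif) - 1)
                        if motif[i:i + 2] == "CG"]
                if not any(c in methylated_cpgs for c in cpgs):
                    # Simplification: cut at the middle of the site.
                    cuts.add(start + len(motif) // 2)
                start = seq.find(motif, start + 1)
    return sorted(cuts)


def digest(seq, methylated_cpgs=frozenset()):
    """Split seq at every unprotected site and return the fragments."""
    bounds = [0] + cut_positions(seq, methylated_cpgs) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


region = "AAAACCGGTTTTGCGCAAAA"
print(digest(region))                           # unmethylated: three fragments
print(digest(region, methylated_cpgs={5, 13}))  # methylated: left intact
```

With more enzymes and a CpG-rich input the unmethylated case yields many very short fragments, while the fully methylated molecule stays whole, which is the contrast the library preparation relies on.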

In other embodiments, originally fragmented DNA (cell-free DNA in blood and urine, or enzymatically, chemically and/or mechanically cut DNA) is converted into a double-stranded DNA library first and then digested with a mixture of several restriction enzymes unable to digest sites having a methylated CpG, or with a restriction enzyme that digests based on methylated CpG, such as McrBC. Libraries are generated by methods employed to facilitate subsequent amplification, and in some embodiments the amplification is global, whereas in other embodiments the amplification may be targeted.

The use of multiple methylation-sensitive restriction enzymes for DNA or library cleavage is beneficial to the efficient depletion of non-methylated regions from the Methylome library. Incomplete cleavage resulting from sources other than methylation-specific cleavage protection may be detrimental to the preparation and analysis of Methylome libraries. Methylome template DNA in which the methylated fraction may constitute less than about 0.1% of total DNA (such as serum and urine DNA from cancer patients, for example) requires efficient cleavage to maximize sensitivity.

In specific embodiments, the invention concerns determining methylation information from a DNA molecule, such as genomic DNA or even a substantially complete genome, by obtaining one or more DNA molecules, cleaving the DNA molecule(s) differentially based on methylation status, generating a library of the cleaved fragments, and analyzing the amplified cleaved fragments.

The generation of Methylome libraries utilized herein may proceed by any method in the art. In specific embodiments, though, the generation of libraries occurs by particular methods. In a first exemplary method, the DNA that is first cleaved by a mixture of multiple restriction enzymes sensitive to methylation is denatured, and is further subjected to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence comprises, in a 5′ to 3′ orientation, a constant region and a variable region; and then subjecting the nucleic acid molecule/primer mixture to a DNA polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the constant region at each end.

A skilled artisan recognizes that the library generated by the first exemplary method utilizes sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, which facilitates not only library generation but also subsequent amplification steps. A skilled artisan also recognizes that there is an expected depletion of non-methylated CpG-rich DNA regions that were converted to very short size during multiple restriction enzyme cleavage prior to generation of this library. Very short DNA fragments are not efficient substrates for this particular described amplification method and will be lost during library preparation and amplification. There may also be an exclusion of sequence surrounding at least one or a group of several known cleavage sites, such as exclusion of sequence surrounding at least part of at least one promoter, such as a promoter involved in regulation of cell growth, for example tumor suppressors and/or oncogenes. There may also be exclusion of sequence surrounding at least part of at least one CpG island, which may in fact be comprised of at least part of a promoter.

In a specific embodiment, the invention introduces a method of enrichment of methylated sequences within the library. For the above-described method, following amplification of the cleaved DNA fragments, there may be generation of a secondary library. For example, the method may further comprise the steps of cleaving the amplified DNA with one of the methylation-sensitive enzymes used in the original library preparation to produce cleaved products; ligating a second adaptor to the ends of the cleaved products; amplifying at least some of the second adaptor-ligated cleaved products; and analyzing the amplified second adaptor-ligated cleaved products. A skilled artisan recognizes that amplification of a secondary library would result in a substantial enrichment for originally methylated CpG-rich DNA regions because only a small fraction of DNA amplicons from the first amplified library would harbor at least two corresponding CpG-containing restriction sites necessary for the generation of a secondary library.
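The two-site requirement for secondary library formation can be illustrated as follows. This is a minimal sketch under stated assumptions: amplified DNA is taken to carry no methylation (PCR does not copy methyl groups), so every recognition site in an amplicon is cleavable; the Hpa II site CCGG is used as an illustrative motif; and the cut is placed at the start of the site rather than at the true cleavage position. The function name is hypothetical.

```python
def secondary_fragments(amplicon, motif="CCGG"):
    """Return fragments bounded by a recognition site at both ends.

    Only such fragments can receive the second adaptor at both ends and
    be amplified, so an amplicon with fewer than two sites yields nothing.
    """
    starts = []
    i = amplicon.find(motif)
    while i != -1:
        starts.append(i)
        i = amplicon.find(motif, i + 1)
    # Simplification: each fragment runs from one site start to the next.
    return [amplicon[a:b] for a, b in zip(starts, starts[1:])]


print(secondary_fragments("AACCGGTTTTCCGGAA"))  # two sites: one fragment
print(secondary_fragments("AACCGGTTTTTTTTAA"))  # one site: no fragment
```

Because amplicons from CpG-poor regions rarely contain two sites, the fragments that survive this selection derive mostly from CpG-rich regions that were protected by methylation during the first digestion.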

In a second exemplary method of library generation, there may be attachment of an adaptor, the adaptor having a nonblocked 3′ end, to the ends of the original or “polished” DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA fragment and the 5′ end of the adaptor; and extending the 3′ end of the DNA fragment from the nick site, incorporating the adaptor sequence into the DNA strand opposite the adaptor-attached DNA strand, followed by library amplification using PCR with a universal primer complementary to at least a portion of the attached adaptor sequence. A skilled artisan recognizes that nuclease cleavage is not required for adaptor attachment and that using a polymerase for “polishing” and a ligase for attachment may result in nicks and/or gaps being repaired within DNA fragments. In this method, cleavage with a mix of multiple restriction endonucleases can be performed (A) before adaptor attachment, (B) immediately after adaptor attachment, or (C) after adaptor attachment and extension of the 3′ end. A skilled artisan recognizes that there is an expected depletion of non-methylated CpG-rich DNA fragments during amplification in cases (B) and (C) resulting from the high probability of cleaving the corresponding amplicons at least once with a mix of multiple restriction enzymes. A skilled artisan recognizes that in case (A) cleavage before the library synthesis may result in very short library amplicons for non-methylated CpG-rich regions that would be lost during amplification by a PCR suppression mechanism.

For this exemplary method, amplification may be followed by cleavage of DNA fragments, thereby selecting a subset of amplicons or secondary library. For example, the method may further comprise the steps of cleaving the amplified library with the same methylation-sensitive enzyme as used in the original library preparation to produce cleaved products; ligating a second adaptor to the ends of the cleaved products; amplifying at least some of the second adaptor-ligated cleaved products; and analyzing the amplified second adaptor-ligated cleaved products. A skilled artisan recognizes that through generation of the libraries by the second method, the cleavage site itself and its integrity are lost, although the adjacent sequences are preserved. A skilled artisan recognizes that amplification of this type of secondary library would result in a substantial enrichment of the originally methylated CpG-rich DNA because only a small fraction of DNA amplicons from the first amplified library would harbor at least two corresponding CpG-containing restriction sites necessary for the generation of a secondary library.

In a third exemplary method of library generation, there may be a one-step multi-enzyme reaction that simultaneously involves DNA, DNA polymerase, DNA ligase, a special hairpin oligonucleotide, a mix of methylation-sensitive restriction enzymes, and a specified enzyme capable of processing a hairpin oligonucleotide before or after its attachment to DNA. The library synthesis reaction proceeds through simultaneous (a) generation of blunt ends at DNA termini and hairpin adaptor; (b) creation of a non-replicable region within the loop of the hairpin oligonucleotide; (c) ligation of the hairpin oligonucleotide to the ends of “polished” DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the DNA fragment is attached to the nonblocked 3′ end of the hairpin adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the adaptor; (d) extension of the 3′ end of the DNA fragment from the nick site to the non-replicable region within the hairpin oligonucleotide; and (e) cleavage of DNA fragments and continuously generated library amplicons with several methylation-sensitive restriction endonucleases. The Methylome library synthesis is followed by library amplification using PCR and universal primer. A skilled artisan recognizes that there is an expected depletion of non-methylated CpG-rich DNA fragments due to the high probability of cleaving of amplicons synthesized at the early stage of the one-step reaction. A skilled artisan also recognizes that the very short library amplicons that can be generated later in a single-step process (as a result of multiple cleavage within non-methylated CpG-rich genomic regions and hairpin adaptor ligation) will be lost during amplification by a PCR suppression mechanism. Finally, a skilled artisan recognizes that nuclease cleavage is not required for adaptor attachment and that using a polymerase for “polishing” and a ligase for attachment may result in nicks and/or gaps being repaired within DNA fragments.

In a specific embodiment, the multiple restriction cleavage is performed separately, such as after the one-step adaptor attachment process described above. The Methylome library synthesis is followed by library amplification using PCR and universal primer.

Methylation libraries utilized herein can be further enriched for CpG-rich regions by implementing a thermo-enrichment step before, during, and/or after the Methylome library preparation and amplification. Library thermo-enrichment is based on differential resistance of double stranded DNA molecules with high GC-base content to strand dissociation at high temperature. The enrichment may be coupled with enzymatic selection for double-stranded DNA molecules. A skilled artisan recognizes that fragment selection and library enrichment level may be adjusted for different GC-base composition by controlling incubation temperature and time, and that they strongly depend on factors such as DNA fragment size, pH, concentration of monovalent and divalent ions, and the presence or absence of effective concentrations of additives that can alter the melting temperature of a double stranded DNA molecule, such as dimethylsulfoxide or formamide, for example. In a specific embodiment, the temperature employed is the temperature that causes denaturation of a specific fraction of the DNA. In further specific embodiments, the temperature is such that about 50% to about 99% of the DNA molecules are denatured.
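The idea of selecting fragments by GC-base content can be sketched computationally. The following is a simplified illustration only: it ranks fragments by GC fraction and applies a cutoff, whereas real strand dissociation also depends on fragment size, pH, ion concentrations, and additives as noted above. The function names and the `min_gc` cutoff of 0.65 are hypothetical choices made for the example.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def thermo_enrich(fragments, min_gc=0.65):
    """Toy stand-in for thermo-enrichment.

    Keeps fragments whose GC fraction meets the cutoff, mimicking the
    expectation that GC-rich duplexes resist denaturation. The cutoff
    is an assumed illustrative value, not an experimentally derived one.
    """
    return [f for f in fragments if gc_fraction(f) >= min_gc]


# Made-up fragments: one GC-rich, one AT-rich.
fragments = ["GCGCGCGCAT", "ATATATATGC"]
retained = thermo_enrich(fragments)
```

Raising or lowering `min_gc` plays the role that adjusting incubation temperature and time plays in the described embodiments.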

In one embodiment, Methylome library thermo-enrichment is achieved by first “polishing” DNA fragment ends with, for example, T4 DNA polymerase, then briefly heating blunt end DNA fragments at sub-melting temperature (~90° C.) and then performing adaptor ligation, 3′ end extension, multiple methylation-sensitive restriction enzyme cleavage, and PCR amplification. A skilled artisan recognizes that in this case only a small fraction of all DNA fragments can be converted into a library and amplified, specifically only GC-rich DNA fragments that do not undergo complete denaturation upon heating and return to the native double strand conformation necessary for efficient adaptor attachment, cleavage, and subsequent 3′ end extension.

In another embodiment, Methylome library thermo-enrichment is achieved by heating DNA fragments at sub-melting temperature (~90° C.) after polishing and adaptor ligation, but before the 3′ end extension with T4 DNA polymerase and multiple methylation-sensitive restriction enzyme cleavage and PCR amplification. A skilled artisan recognizes that in this case only a small fraction of all DNA fragments can be converted into a library and amplified, specifically only GC-rich DNA fragments that survive heating and retain a double stranded conformation necessary for efficient extension and library synthesis completion.

In another embodiment, Methylome library thermo-enrichment is performed after library synthesis or even after library synthesis and amplification. In this case, heating of library amplicons at sub-melting temperature (~90° C.) is followed by incubation with one or more single-strand specific nucleases such as S1 or Mung Bean nuclease, purification of the sample, and re-amplification of the selected amplicon fraction that proved resistant to single strand specific nuclease digestion. A skilled artisan recognizes that in this case only a fraction of the library, specifically the most stable GC-rich molecules, can retain a double stranded structure, survive nuclease (S1 and/or Mung Bean) treatment, and therefore remain competent for re-amplification.

In one embodiment of the invention, bisulfite-treated DNA is further subjected to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence comprises, in a 5′ to 3′ orientation, a constant region and a variable region; and then subjecting the bisulfite-converted nucleic acid molecule/primer mixture to a DNA polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the constant region at each end. The synthesized bisulfite-converted DNA library is then amplified by PCR with universal primer and analyzed.

In another specific embodiment, the bisulfite conversion occurs after attaching adaptors to DNA fragments generated by enzymatic fragmentation, such as with nuclease, by chemical fragmentation, or by mechanical fragmentation. The adaptor sequences can be designed to be resistant to bisulfite treatment so that amplification of the bisulfite-converted DNA library can be performed using the same primer sequences.

In another embodiment, a promoter-depleted bisulfite-converted DNA library may be synthesized by the attachment of adaptor and by the digestion with multiple methylation-sensitive restriction enzymes, followed by bisulfite conversion and amplification by PCR with universal primer for analysis. A skilled artisan realizes that such a library would be substantially depleted of originally non-methylated CpG-rich promoter DNA regions and can be especially useful for methylation analysis of DNA with low amounts of methylated DNA (such as cell-free blood and urine DNA from individuals with cancer, for example). A skilled artisan realizes that all previously described variations of the adaptor-mediated method (including the one-step hairpin oligonucleotide method) can be applied to create a promoter-depleted bisulfite-converted DNA library.

In particular aspects of the invention, the ends of the cleaved fragments further comprise a particular sequence, structure (such as an overhang), or both that may be generated during library generation. In specific embodiments, the particular sequence, structure, or both may be added following library generation. The particular sequence and/or structure is preferably known, and in some embodiments the ends of the cleaved fragments of the library comprise substantially the same sequence, structure, or both. Furthermore, in amplification steps this particular sequence may be targeted, such as with a complementary primer.

In other embodiments, the library that is generated, including one that may have been amplified, is analyzed such that the one or more characteristics of the original DNA molecule may be identified. For example, the analysis may be of any kind sufficient to gain information, although in specific embodiments it comprises at least sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, restriction enzyme digestion, a combination thereof, or other suitable methods known in the art. In some embodiments concerning analysis of methylation status, substantially every CpG island may be cleaved, as opposed to some other methods in the art wherein cleavage occurs outside CpG islands. In other embodiments, there are gaps in the library, such as one from DNA from non-cancerous cells, that represent a non-methylated CpG island (promoter). In the corresponding cancerous DNA, there are substantially no gaps in the particular region representing a methylated CpG island (such as in a promoter).

In other embodiments, libraries are generated from bisulfite-converted DNA for the purpose of sequencing GC-rich regions and repetitive regions of genomic DNA. GC-rich regions and repetitive elements are often difficult to accurately sequence due to the formation of secondary structure and/or due to slippage of the polymerase during polymerization. Thus, bisulfite conversion of GC-rich regions will result in modification of the sequence to remove secondary structures by conversion of C to T. Sequencing of both strands of the converted DNA will allow the comparison of the obtained converted sequences to determine the original sequence. Similarly, partial bisulfite conversion of repetitive elements will result in changes in the sequence that will minimize secondary structure, thereby improving the sequencing results and allowing determination of the original sequence through comparison of the sequences obtained from each strand. Furthermore, the partial conversion of GC-rich regions and repetitive elements can decrease stretches of homopolymeric cytosines and, therefore, result in improved sequencing of regions that are susceptible to slippage during polymerization.
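The comparison of the two converted strands can be illustrated with a minimal in silico sketch. It assumes complete conversion of every unmethylated cytosine (read as T after amplification), error-free reads of equal length, and no indels; the function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(strand: str, methylated=frozenset()) -> str:
    """Convert each unmethylated C on this strand to T.

    methylated holds 0-based indices (on this strand) of protected cytosines.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(strand)
    )


def reconstruct(top_read: str, bottom_read: str) -> str:
    """Infer the original top strand from the two converted strands.

    Conversion of the bottom strand appears as G->A when expressed in
    top-strand coordinates, so a T/C disagreement implies an original C
    and a G/A disagreement implies an original G.
    """
    bottom_as_top = reverse_complement(bottom_read)
    out = []
    for t, b in zip(top_read, bottom_as_top):
        if t == b:
            out.append(t)
        elif t == "T" and b == "C":
            out.append("C")
        elif t == "G" and b == "A":
            out.append("G")
        else:
            out.append("N")  # inconsistent pair, left undetermined
    return "".join(out)


original = "ACGTCCGGAT"  # made-up example sequence
top = bisulfite_convert(original)
bottom = bisulfite_convert(reverse_complement(original))
recovered = reconstruct(top, bottom)
```

Here the converted top strand no longer contains any cytosine, yet combining it with the converted bottom strand recovers the starting sequence.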

The information provided by the methods described herein is useful for a variety of applications. For example, the information may be utilized to develop discovery tools that increase our understanding of the mechanisms of disease progression, and/or diagnostic tools that allow the early detection, diagnosis, treatment and/or post-treatment monitoring of disease, such as cancer.

In specific embodiments, the present invention regards a method for analyzing a DNA molecule, comprising obtaining at least one DNA molecule having one or more regions exhibiting differential characteristics; selectively modifying the at least one DNA molecule at the regions exhibiting the one or more characteristics; incorporating at least one known sequence at both ends of the DNA molecule to produce at least one modified molecule; amplifying the at least one modified molecule; and analyzing the amplified molecule. In a specific embodiment, the at least one DNA molecule comprises genomic DNA or is a genome. In another specific embodiment, the differential characteristics comprise epigenetic modification, structure, sequence, association with non-nucleotide factors, or a combination thereof. In a specific embodiment, the epigenetic modification comprises methylation. In particular embodiments, the altering comprises cleaving, wherein the altered molecule is further defined as comprising fragments. In a specific embodiment, modifying comprises bisulfite conversion. The cleaving step may comprise digestion with a methylation-sensitive enzyme and/or a methylation-specific enzyme. In another specific embodiment, the ends of the cleaved fragments are further defined as having at least one known sequence, at least one known structure, or both. In an additional embodiment, the at least one known sequence, at least one known structure, or both is the same for substantially all of the ends of the cleaved fragments. In a particular embodiment, the amplifying step utilizes a primer complementary to the known sequence, a primer complementary to a desired sequence in the DNA molecule, or both.

In particular aspects of the invention, the incorporating step is further defined as subjecting the cleaved fragments to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence comprises in a 5′ to 3′ orientation a constant region and a variable region; and subjecting the nucleic acid molecule/primer mixture to a DNA polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the constant region at each end. In a specific embodiment, the fragments do not comprise the at least one known cleavage site. In another specific embodiment, the fragments substantially exclude sequence surrounding the at least one known cleavage site. In a specific embodiment, the sequence surrounding the at least one known cleavage site is further defined as comprising at least part of at least one promoter. In an additional embodiment, the sequence surrounding the at least one known cleavage site is further defined as comprising at least part of at least one CpG island. In some embodiments, methods of the present invention further comprise the steps of cleaving the amplified fragments in substantially the same manner as cleavage of the DNA molecule, thereby producing cleaved products; ligating an adaptor to the ends of the cleaved products; amplifying at least some of the adaptor-ligated cleaved products; and analyzing the amplified adaptor-ligated cleaved products.
In a specific embodiment, the incorporating step is further defined as attaching a first adaptor having a nonblocked 3′ end to the ends of the cleaved fragments to produce adaptor-linked fragments, wherein the 5′ end of the cleaved fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the first adaptor; and extending the 3′ end of the cleaved fragment from the nick site. In a particular embodiment, prior to the attaching step the method further comprises randomly fragmenting the cleaved fragments; and modifying the ends of the cleaved fragments to provide attachable ends. In other embodiments, the method further comprises the steps of cleaving the amplified cleaved fragments in substantially the same manner as cleavage of the at least one DNA molecule, thereby producing cleaved products; ligating a second adaptor to the ends of the cleaved products; amplifying at least some of the second adaptor-ligated cleaved products; and analyzing the amplified second adaptor-ligated cleaved products. In a specific embodiment, cleaving of the at least one DNA molecule and cleaving of the amplified cleaved fragments comprise cleavage with a methylation-sensitive enzyme. In another specific embodiment, the second adaptor comprises one or more known sequences. In another embodiment of the invention, the incorporating step is further defined as subjecting the bisulfite-converted molecules to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence comprises in a 5′ to 3′ orientation a constant region and a variable region; and subjecting the nucleic acid molecule/primer mixture to a DNA polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the constant region at each end.
In a specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, restriction enzyme digestion, or a combination thereof.

In an additional embodiment of the present invention, there is a method for determining information from a DNA molecule, comprising obtaining at least one DNA molecule having one or more regions exhibiting differential characteristics; incorporating at least one known sequence at the ends of fragments of the molecule; selectively modifying said DNA fragments at said regions according to said one or more characteristics; amplifying the modified fragments; and analyzing the amplified altered fragments. In a specific embodiment, the at least one DNA molecule comprises genomic DNA or is a genome. In a specific embodiment, the differential characteristics comprise epigenetic modification, structure, sequence, association with non-nucleotide factors, or a combination thereof. In a specific embodiment, the differential characteristic comprises epigenetic modification, such as methylation. The altering may comprise cleaving or bisulfite conversion, for example. In specific embodiments, the cleaving step comprises methylation-specific digestion and/or methylation-sensitive digestion. In a specific embodiment, the ends of the fragments are further defined as having at least one known sequence, at least one known structure, or both. In another specific embodiment, the at least one known sequence, at least one known structure, or both is the same for substantially all of the ends of the cleaved fragments. The amplifying step may utilize a primer complementary to the known sequence, a primer complementary to a desired sequence in a fragment, or both.
In a specific embodiment, the incorporating step is further defined as randomly fragmenting the cleaved fragments; modifying the ends of the cleaved fragments to provide attachable ends; attaching a first adaptor having a nonblocked 3′ end to the ends of the DNA library fragments to produce first adaptor-linked fragments, wherein the 5′ end of the library fragment is attached to the nonblocked 3′ end of the first adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the first adaptor; and extending the 3′ end of the library fragment from the nick site. In a specific embodiment, the method further comprises the steps of cleaving said amplified cleaved fragments in substantially the same manner as cleavage of the at least one DNA molecule, thereby producing cleaved products; ligating a second adaptor to the ends of the cleaved products; amplifying at least some of the second adaptor-ligated cleaved products; and analyzing the amplified second adaptor-ligated cleaved products. In a specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, restriction enzyme digestion, or a combination thereof.

In another embodiment, there is a method of determining methylation status of at least one sequence, comprising obtaining at least one DNA molecule; digesting the at least one DNA molecule with a methylation-sensitive restriction enzyme; incorporating sequence at the ends of the DNA fragments with at least one primer from a plurality of primers, said primer comprising a 5′ constant sequence and a 3′ variable sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality; amplifying one or more DNA fragments utilizing a primer complementary to at least part of the constant sequence; and analyzing at least part of the sequence of at least one amplified DNA fragment. In a specific embodiment, the methylation-sensitive restriction enzyme cleaves at a site comprising a CpG dinucleotide. In a specific embodiment, the methylation-sensitive restriction enzyme is BstUI, AciI, HpaII, HhaI, or a mixture thereof. The incorporating step may be further defined as generating single stranded nucleic acid molecules from the DNA fragments; subjecting the single stranded nucleic acid molecules to a plurality of primers to form a single stranded nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality and wherein the primers comprise a constant nucleic acid sequence and a variable nucleic acid sequence; and subjecting said single stranded nucleic acid molecule/primer mixture to a polymerase, under conditions wherein said subjecting steps generate a plurality of molecules comprising the constant nucleic acid sequence at each end. In a specific embodiment, the polymerase is a strand-displacing polymerase. In another specific embodiment, the amplifying step comprises polymerase chain reaction.
In a specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. The method may further comprise the step of comparing at least part of the sequence of the amplified fragment with a control DNA molecule that was not subjected to the digestion step. In a specific embodiment, the method further comprises digesting the amplified DNA fragments with the methylation-sensitive restriction enzyme; attaching an adaptor to at least one digested amplified DNA fragment to produce an adaptor-linked fragment, wherein the 5′ end of the digested amplified DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the adaptor; extending the 3′ end of the digested amplified DNA fragment from the nick site; amplifying the adaptor-linked fragments with a first primer complementary to at least part of the adaptor to produce amplified adaptor-linked fragments; and analyzing the amplified adaptor-linked fragments to determine the methylation status of the original DNA. In a specific embodiment, the adaptor comprises at least one end that is complementary to the ends of the digested amplified DNA fragments. In a specific embodiment, the adaptor comprises at least one blunt end. In another specific embodiment, the adaptor comprises one or more known sequences, such as sequences that are substantially non-self-complementary and do not substantially interact. In a specific embodiment, the amplifying step comprises polymerase chain reaction. In a further specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.
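The effect of methylation-sensitive digestion on fragment size can be sketched in silico. The example below is a simplification under stated assumptions: it models only the three palindromic enzymes named above (HpaII, C^CGG; HhaI, GCG^C; BstUI, CG^CG), tracks top-strand cut positions only, and treats a site as protected if any listed methylated position falls within it. AciI is omitted because its recognition site is not palindromic. The names `SITES`, `cut_positions`, and `digest`, and the example sequence, are hypothetical.

```python
# Enzyme name -> (recognition site, top-strand cut offset within the site).
SITES = {
    "HpaII": ("CCGG", 1),
    "HhaI": ("GCGC", 3),
    "BstUI": ("CGCG", 2),
}


def cut_positions(seq, methylated=frozenset(), enzymes=SITES):
    """Return sorted cut coordinates for sites that carry no methylated base."""
    cuts = set()
    for site, offset in enzymes.values():
        start = seq.find(site)
        while start != -1:
            span = range(start, start + len(site))
            if not any(p in methylated for p in span):
                cuts.add(start + offset)
            start = seq.find(site, start + 1)
    return sorted(cuts)


def digest(seq, methylated=frozenset(), enzymes=SITES):
    """Split seq at every unprotected site and return the fragments."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, methylated, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


seq = "AAAACCGGTTTTGCGCAAAA"  # made-up sequence with one HpaII and one HhaI site
unmethylated = digest(seq)
fully_methylated = digest(seq, methylated={5, 13})  # indices of the CpG cytosines
```

With both CpG cytosines methylated the sequence stays intact, whereas the unmethylated sequence is cut into three short pieces, which mirrors the expected loss of non-methylated CpG-rich regions from the amplified library.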

Another embodiment of the invention relates to a method for determining methylation status of a DNA molecule, comprising obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-specific endonuclease; modifying the ends of the DNA fragments to incorporate a label in at least one strand, thereby producing modified DNA fragments; immobilizing at least one modified DNA product through the label to produce an immobilized DNA product; and analyzing the immobilized DNA product to determine the methylation status of the original DNA molecule. In a specific embodiment, the methylation-specific endonuclease is McrBC. In an additional specific embodiment, the incorporation of label utilizes DNA polymerase or terminal transferase, for example. In a specific embodiment, the label comprises an affinity tag, such as, for example, one that comprises at least one biotin molecule. In a specific embodiment, the method further comprises the step of randomly fragmenting the modified DNA fragments. The fragmenting step may comprise chemical fragmentation by heat, for example. In an additional specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. In a further embodiment, the quantitative real-time polymerase chain reaction or ligation-mediated polymerase chain reaction uses a primer complementary to a desired region of the immobilized DNA product.
In particular embodiments, the methods of the invention further comprise subjecting the immobilized DNA product to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence comprises in a 5′ to 3′ orientation a constant region and a variable region; subjecting the nucleic acid molecule/primer mixture to a DNA polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the constant region at each end; amplifying at least one of the molecules utilizing a primer comprising at least part of the constant region at both ends; and analyzing at least one of the amplified fragments to determine the methylation status of the original DNA molecule. In a specific embodiment, the nucleic acid molecule is single stranded. In another specific embodiment, the DNA polymerase is a strand-displacing polymerase. The amplifying step may comprise polymerase chain reaction, for example. The analyzing step may comprise sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, microarray hybridization, or a combination thereof.
In a particular aspect of the invention, there is a method for determining the methylation status of a nucleic acid molecule, comprising obtaining at least one nucleic acid molecule; providing sodium bisulfite to the nucleic acid molecules, wherein the unmethylated cytosines in the nucleic acid molecules are converted to uracil, thereby producing bisulfite-converted single-stranded nucleic acid molecules; subjecting the bisulfite-converted single stranded nucleic acid molecules to a plurality of primers having a constant region and a variable region to form a bisulfite-converted single stranded nucleic acid molecule/primer mixture, wherein the primers comprise a first nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality; and a second nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality and wherein the variable region is enriched in a particular nucleotide to specifically target the bisulfite-converted single-stranded nucleic acid molecules; subjecting the bisulfite-converted single stranded nucleic acid molecule/primer mixture to a polymerase, under conditions wherein the subjecting step generates a plurality of molecules comprising the constant region at each end; amplifying a plurality of the molecules comprising the constant region at each end by utilizing a primer complementary to at least part of the constant sequence, thereby producing amplified molecules; and analyzing the amplified molecules to determine the methylation status of the original DNA molecule. In a specific embodiment, the method further comprises the step of randomly fragmenting the bisulfite-converted nucleic acid molecules to produce bisulfite-converted single-stranded nucleic acid fragments. In a specific embodiment, the random fragmentation comprises chemical fragmentation, such as by heat, for example.
In another specific embodiment, the polymerase is a strand-displacing polymerase. In another specific embodiment, the amplifying step comprises polymerase chain reaction. In an additional specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. In a further specific embodiment, the quantitative real-time polymerase chain reaction or ligation-mediated polymerase chain reaction comprises methylation-specific polymerase chain reaction.

In a particular embodiment of the invention, there is a method of determining the methylation status of at least part of a DNA molecule, comprising the steps of obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-sensitive restriction enzyme to produce DNA fragments; attaching a first adaptor to the ends of the digested fragments to produce first adaptor-linked fragments, wherein said attaching step comprises one or both of the following steps: (a) modifying the ends of the DNA fragments to provide attachable ends; attaching a first adaptor having a known sequence and a nonblocked 3′ end to the ends of the modified DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the modified DNA is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the adaptor; and extending the 3′ end of the modified DNA from the nick site; and (b) subjecting the DNA fragments to a mixture of adaptors comprising one or more types of ends, said ends comprising 3′ overhangs; 5′ overhangs; or blunt ends; extending the 3′ end of the modified DNA fragments from the nick site; amplifying the first adaptor-linked fragments with a primer complementary to the first adaptor; and analyzing at least part of the sequence of the amplified first adaptor-linked fragments. In a specific embodiment, the first adaptor further comprises at least one of the following: absence of a 5′ phosphate group; a 5′ overhang of about 7 nucleotides in length; and a 3′ blocked nucleotide. In an additional specific embodiment, the method further comprises the step of incorporating a homopolymeric sequence to the ends of the first adaptor-linked fragments.
In a specific embodiment, the incorporating step comprises amplifying the first adaptor-linked fragments utilizing a primer comprising a homopolymeric sequence at its 5′ end (such as comprising cytosines); or utilizing terminal transferase activity at the 3′ ends of the amplified first adaptor-linked fragments, for example.

In an additional specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. In an additional specific embodiment, the analyzing step comprises the comparison of amplified adaptor-linked fragments from methylation-sensitive digested DNA molecules and undigested DNA molecules. In an additional specific embodiment, the method further comprises the steps of digesting the amplified homopolymeric sequence-comprising adaptor-linked fragments with the methylation-sensitive restriction enzyme; attaching a second adaptor to the ends of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments to produce secondary adaptor-linked fragments; amplifying the secondary adaptor-linked fragments with a first primer complementary to the second adaptor and a second primer complementary to the homopolymeric sequence of the second adaptor-linked fragments; and analyzing at least part of the sequence of the amplified secondary adaptor-linked fragments. In an additional specific embodiment, the ends of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments are modified to produce attachable ends. In another specific embodiment, the second adaptor is comprised of at least one blunt end or the second adaptor is comprised of overhangs complementary to the ends of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments, for example. The analyzing step may comprise quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.

In another aspect of the invention, there is a method of determining the methylation status of at least part of a DNA molecule, comprising the steps of obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-sensitive restriction enzyme to produce DNA fragments; randomly fragmenting the digested DNA fragments; modifying the ends of the digested DNA fragments to produce modified DNA fragments with attachable ends; attaching a first adaptor to the ends of the modified DNA fragments to produce first adaptor-linked fragments, wherein the 5′ end of the modified DNA is attached to the nonblocked 3′ end of the first adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA and a 5′ end of the first adaptor; extending the 3′ end of the modified DNA fragment from the nick site; amplifying the first adaptor-linked fragments with a primer complementary to at least part of the first adaptor; and analyzing at least part of the sequence of the amplified first adaptor-linked fragments to determine the methylation status of the original DNA molecule. In a specific embodiment, the first adaptor comprises at least one of the following: absence of a 5′ phosphate group; a 5′ overhang of about 7 nucleotides in length; and a 3′ blocked nucleotide. In a specific embodiment, the amplifying step comprises polymerase chain reaction. In another specific embodiment, the method further comprises the step of incorporating a homopolymeric sequence to the ends of the amplified first adaptor-linked fragments to produce amplified homopolymeric sequence-comprising first adaptor-linked fragments. The incorporating step may comprise amplifying the first adaptor-linked fragments utilizing a primer comprising a homopolymeric sequence at its 5′ end, or it may comprise utilizing terminal transferase activity at the 3′ ends of the amplified first adaptor-linked fragments, for example. In another specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.

In an additional embodiment, the method further comprises the steps of digesting the amplified homopolymeric sequence-comprising first adaptor-linked fragments with the methylation-sensitive restriction enzyme; ligating a second adaptor to the ends of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments to produce second adaptor-linked fragments, wherein the 5′ end of the modified DNA is attached to the nonblocked 3′ end of the second adaptor, leaving a nick site between the juxtaposed 3′ end of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments and a 5′ end of the second adaptor; extending the 3′ end of the digested amplified homopolymeric sequence-comprising adaptor-linked fragments from the nick site; amplifying the second adaptor-linked fragments with a first primer complementary to at least part of the second adaptor and a second primer complementary to at least part of the homopolymeric sequence; and analyzing at least part of the sequence of the amplified second adaptor-linked fragments to determine the methylation status of the original DNA molecule. In a specific embodiment, the second adaptor comprises at least one end complementary to the ends produced by digesting the amplified homopolymeric sequence-comprising first adaptor-linked fragments. In a specific embodiment, the second adaptor comprises one or more known sequences. In another specific embodiment, the one or more known sequences do not substantially interact. In an additional embodiment, the amplifying step comprises polymerase chain reaction. In a further specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.

In a particular aspect of the invention, there is a method for preparing a DNA molecule, comprising obtaining at least one DNA molecule; digesting the at least one DNA molecule with a methylation-specific endonuclease; attaching an adaptor having a known sequence and a nonblocked 3′ end to the ends of the digested fragments to produce adaptor-linked fragments, wherein the 5′ end of the digested fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the digested fragment and a 5′ end of the adaptor; amplifying at least one adaptor-linked fragment using a primer that is complementary to at least part of the adaptor to produce size-selected adaptor-linked products; and analyzing at least one of the size-selected adaptor-linked products to determine the methylation status of the original DNA. In a specific embodiment, the methylation-specific endonuclease is McrBC. In a further specific embodiment, the adaptor comprises a 1 to about 6 base pair 5′ N base overhang. In an additional specific embodiment, the ends of the DNA fragments are modified to provide attachable ends. In a further specific embodiment, the adaptor comprises at least one blunt end and/or the adaptor comprises one or more known sequences. In a specific embodiment, the one or more known sequences are substantially non-interactive. In a specific embodiment, the amplifying of the at least one adaptor-linked fragment comprises size-selective polymerase chain reaction. In a specific embodiment, the size-selective polymerase chain reaction comprises utilization of a short polymerization step, such as one that comprises about 5 seconds to about 20 seconds or that comprises about 10 seconds, for example. In a specific embodiment, the short polymerization step results in amplicons of between about 30 bp and about 200 bp. In an additional specific embodiment, the adaptor-linked DNA fragments are size-fractionated by physical means prior to the amplifying step, the size fractionation comprises filtration, or the fractionation comprises membrane ultrafiltration, for example. In specific embodiments, the digested DNA fragments are size-fractionated by physical means prior to attachment of the adaptor. In particular embodiments, the size fractionation comprises filtration, or membrane ultrafiltration, for example. In a specific embodiment, the amplifying step comprises polymerase chain reaction. In a specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.

In an additional aspect of the invention, there is a method for preparing a DNA molecule, comprising obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-specific endonuclease to produce DNA fragments; attaching an adaptor to the ends of the digested DNA fragments to provide a nick translation initiation site, thereby producing adaptor-linked fragments; and subjecting the adaptor-linked fragments to nick translation to produce nick translate molecules. In a specific embodiment, the methylation-specific endonuclease is McrBC. In a specific embodiment, the adaptors comprise a mixture of primers comprising 1 to about 6 bp 5′ N overhangs. In another specific embodiment, the ends of the digested DNA fragments are modified to provide attachable ends. In an additional specific embodiment, the adaptor comprises at least one blunt end and/or the adaptor comprises a label, such as a 5′ label and/or an affinity tag, such as one that comprises at least one biotin molecule. The method may further comprise the step of immobilizing the nick translate molecules through the label. In specific embodiments, the immobilizing step further comprises denaturation of the nick translate molecules. In a particular aspect of the invention, the adaptor comprises a constant sequence comprising a 5′ affinity tag on one strand, and a 5′ phosphate and a 3′ blocked group on the second strand. In another specific embodiment, a 3′ end of the modified DNA fragment is attached to the 5′ phosphorylated end of the adaptor, thereby leaving a nick between the juxtaposed 5′ end of the DNA and the 3′ end of the adaptor. In an additional specific embodiment, the second strand comprises an internal nick. In a further specific embodiment, a 3′ end of the digested DNA fragment is attached to the 5′ phosphorylated end of the adaptor, thereby leaving a first nick in the middle of the non-ligated adaptor sequence and a second nick between the juxtaposed 5′ end of the DNA and the 3′ end of the adaptor. The method may be further defined as determining the methylation status of the DNA molecule, and comprising amplifying at least one of the nick translate molecules to produce amplified nick translate molecules; and analyzing the amplified nick translate molecules. In a specific embodiment, the analyzing step comprises analyzing at least one amplified nick translate molecule for at least one sequence adjacent to a cleavage site of the restriction endonuclease. The analyzing step may comprise sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, microarray hybridization, or a combination thereof. The analyzing step may comprise comparison of the amplified molecule with a DNA molecule that was not subjected to digestion with the methylation-specific endonuclease. In a particular aspect of the invention, the method is further defined as determining the methylation status of the DNA molecule and comprising subjecting the immobilized denatured molecules to a plurality of primers to form a nucleic acid molecule/primer mixture, wherein the primers comprise a nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein said sequence comprises in a 5′ to 3′ orientation a constant region and a variable region; subjecting said single stranded nucleic acid molecule/primer mixture to a polymerase, under conditions wherein the subjecting steps generate a plurality of molecules comprising the known nucleic acid sequence at each end; amplifying at least one of the molecules comprising the constant region at both ends; and analyzing at least one of the amplified fragments to determine the methylation status of the original DNA molecule. The amplifying step may comprise polymerase chain reaction, such as one that utilizes a primer complementary to at least part of the constant region. In a specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof.

In a particular embodiment of the invention, there is a method for preparing a DNA molecule, comprising obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-specific endonuclease; attaching a first adaptor having a first known sequence and a nonblocked 3′ end to the ends of the digested DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the digested DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick between the juxtaposed 3′ end of the digested DNA fragment and the 5′ end of the adaptor; extending the 3′ end of the adaptor-linked fragment from the nick site; randomly fragmenting the adaptor-linked fragments to produce fragmented molecules; modifying the ends of the fragmented molecules to provide attachable ends, thereby producing modified fragmented molecules; attaching a second adaptor having a second known sequence and a nonblocked 3′ end to the ends of the modified fragmented molecules to produce adaptor-linked modified fragmented molecules, wherein the 5′ end of the modified fragmented molecule is attached to the nonblocked 3′ end of the second adaptor, leaving a nick site between the juxtaposed 3′ end of the modified fragmented molecule and the 5′ end of the second adaptor; extending the 3′ end of the adaptor-linked modified fragmented molecules from the nick site to produce extended adaptor-linked modified fragmented molecules; amplifying at least one of the extended adaptor-linked modified fragmented molecules; and analyzing the amplified molecules to determine the methylation status of the original DNA molecule. In a specific embodiment, the methylation-specific endonuclease is McrBC. The first adaptor may comprise a mixture of primers comprising 1 to about 6 base pair 5′ N base overhangs. In a specific embodiment, the first adaptor comprises at least one blunt end. In another specific embodiment, the second adaptor comprises at least one blunt end. In an additional specific embodiment, the first and second adaptors comprise the same sequence. In an additional specific embodiment, the amplifying step comprises polymerase chain reaction, such as one that comprises a primer directed to at least part of the sequence of the first adaptor, at least part of the sequence of the second adaptor, or a mixture thereof, for example. In a specific embodiment, the analysis comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, microarray hybridization, or a combination thereof.

In one aspect of the invention, there is a method for determining the methylation status of a DNA molecule, comprising obtaining at least one DNA molecule; digesting the DNA molecule with a methylation-specific endonuclease; providing an adaptor comprising: a known sequence; and a nonblocked 3′ end; attaching the adaptor to the ends of the digested DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the digested DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the digested DNA fragment and a 5′ end of the adaptor; extending the 3′ end of the modified DNA fragment from the nick site; amplifying at least a portion of the modified DNA fragment to produce at least one amplification product; and analyzing at least one amplification product to determine the methylation status of the original DNA molecule. In a specific embodiment, the methylation-specific endonuclease is McrBC. The adaptor may comprise a mixture of primers comprising 1 to about 6 base pair 5′ N overhangs. In a specific embodiment, the ends of the digested DNA fragments are modified to provide attachable ends. In another specific embodiment, the adaptor comprises at least one blunt end. In a further specific embodiment, the amplification primer comprises a homopolymeric sequence. In an additional specific embodiment, the adaptor comprises a homopolymeric sequence of cytosines. In another specific embodiment, the adaptor-attached DNA fragments comprise a homopolymeric sequence added to the 3′ end enzymatically, and the homopolymeric sequence may be added by terminal transferase and/or may comprise guanines, for example. In an additional specific embodiment, the amplifying step and/or the analyzing step comprises polymerase chain reaction.

The analyzing step may comprise polymerase chain reaction that utilizes a first primer complementary to the homopolymeric region and a second primer complementary to a desired sequence in the amplified DNA fragment, for example, and the primer complementary to the homopolymeric region may comprise cytosines, for example.

In one aspect of the invention, there is a method of determining the methylation status of at least part of at least one DNA molecule, comprising the steps of obtaining the DNA molecule; attaching a first adaptor to the ends of the DNA molecule to produce first adaptor-linked molecules, wherein said first adaptor comprises homopolymeric sequence and said attaching step comprises one or both of the following steps: (a) modifying the ends of the DNA molecules to provide attachable ends; attaching a first adaptor having a known sequence and a nonblocked 3′ end to the ends of the DNA molecules to produce adaptor-linked molecules, wherein the 5′ end of the DNA is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA molecule and a 5′ end of the adaptor; or (b) subjecting the DNA molecules to a mixture of adaptors comprising one or more types of ends, said ends comprising 3′ overhangs; 5′ overhangs; or blunt ends; to produce adaptor-linked molecules, wherein the 5′ end of the DNA molecule is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA molecule and a 5′ end of the adaptor; extending the ends of the DNA molecule from the nick site; digesting the first adaptor-linked molecules with a methylation-specific restriction endonuclease; attaching a second adaptor to the ends of the digested first adaptor-linked DNA fragments to produce second adaptor-linked DNA fragments, wherein the 5′ end of the first adaptor-linked DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the first adaptor-linked DNA fragment and a 5′ end of the adaptor; amplifying the second adaptor-linked molecules utilizing a primer mixture comprising a first primer that is complementary to at least part of the second adaptor and a second primer that is complementary to at least part of the homopolymeric tail; and analyzing the amplified second adaptor-linked fragments to determine the methylation status of the original DNA molecule. In a specific embodiment, the first adaptor comprises a homopolymeric tail, such as one that comprises cytosines, for example. In a specific embodiment, the first adaptor-linked fragments comprise a homopolymeric sequence that is attached enzymatically. In another specific embodiment, the enzymatic attachment of the homopolymeric sequence comprises terminal transferase activity. In an additional specific embodiment, the homopolymeric sequence comprises guanines. In an additional specific embodiment, the methylation-specific endonuclease comprises McrBC. In another specific embodiment, the second adaptor comprises a mixture of 1 to about 6 base pair 5′ N base overhangs. The second adaptor may comprise more than one known sequence that is substantially non-self-complementary and substantially non-interactive, for example. In an additional embodiment, the amplifying step comprises polymerase chain reaction. The analyzing step may comprise sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. In a specific embodiment, the DNA molecule is obtained from plasma or serum.

Another embodiment of the invention concerns a method for determining the methylation status of a nucleic acid molecule, comprising obtaining at least one nucleic acid molecule; randomly fragmenting the nucleic acid molecule to produce fragmented molecules; modifying the ends of the DNA fragments to provide attachable ends, thereby producing modified DNA fragments; attaching a first adaptor to the ends of the modified DNA fragments to produce adaptor-linked fragments, wherein the 5′ end of the modified DNA fragment is attached to the nonblocked 3′ end of the first adaptor, leaving a nick site between the juxtaposed 3′ end of the modified DNA fragment and a 5′ end of the first adaptor; extending the 3′ end of the adaptor-linked fragments from the nick site; providing sodium bisulfite to said adaptor-linked fragments, wherein the unmethylated cytosines in said nucleic acid molecules are converted to uracil, thereby producing bisulfite-converted molecules; amplifying a plurality of the bisulfite-converted molecules utilizing a primer complementary to at least part of the adaptor, thereby producing amplified molecules; and analyzing the amplified molecules to determine the methylation status of the original DNA molecule. In a specific embodiment, the random fragmentation comprises chemical fragmentation, such as comprising heat, and/or the fragmentation comprises mechanical fragmentation. In a specific embodiment, the attached strand of the adaptor sequence does not comprise guanine and all cytosines are methylated. In an alternative embodiment, the attached strand of the adaptor sequence does not comprise cytosine. In a specific embodiment, the extension of the 3′ nick site is performed in the presence of guanine, adenine, thymine, and methylated cytosine. In a specific embodiment, the amplifying step comprises polymerase chain reaction. In another specific embodiment, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof. In a specific embodiment, the quantitative real-time polymerase chain reaction or ligation-mediated polymerase chain reaction comprises methylation-specific polymerase chain reaction.

In another aspect of the invention, there is a method for preparing a DNA molecule, comprising obtaining at least one DNA molecule; randomly fragmenting the DNA molecule to produce DNA fragments; modifying the ends of the DNA fragments to provide attachable ends, thereby producing modified DNA fragments; attaching an adaptor having a known sequence and a nonblocked 3′ end to the ends of the modified DNA fragment to produce adaptor-linked fragments, wherein the 5′ end of the modified DNA fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the modified DNA fragment and a 5′ end of the adaptor; extending the 3′ end of the modified DNA fragment from the nick site; digesting at least some of the amplified adaptor-linked fragments with a methylation-specific endonuclease; amplifying at least one of the adaptor-linked fragments that were not digested by the methylation-specific endonuclease, thereby producing an amplified undigested adaptor-linked fragment, said amplifying using a primer complementary to the adaptor; and analyzing at least one amplified undigested adaptor-linked fragment to determine the methylation status of the original DNA molecule. In a specific embodiment, the random fragmentation comprises chemical fragmentation and/or comprises mechanical fragmentation. In a specific embodiment, the adaptor comprises at least one blunt end. In another specific embodiment, the methylation-specific endonuclease is McrBC. In particular embodiments, the amplification step comprises polymerase chain reaction. In additional particular embodiments, the analyzing step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, microarray hybridization, or a combination thereof. In a specific embodiment, the analyzing step comprises comparing at least one digested amplified adaptor-linked fragment with at least one undigested amplified adaptor-linked fragment.

In a specific aspect of the invention, the methods and compositions provided herein regard detection, such as diagnosis, of cancer, prognosis of cancer, differentiation of aggressive vs. non-aggressive cancer, monitoring of progression of cancer and/or the drug effects on cancer, determination of susceptibility to developing cancer in an individual, and/or determining resistance to cancer therapy and/or susceptibility to developing a resistance to cancer therapy. In particular embodiments, at least one sample from an individual suspected of having cancer or that has cancer is subjected to a method of the invention such that a diagnosis, prognosis, or characterization can be made. In a specific embodiment, the methylation status of at least one DNA molecule from an individual suspected of having or developing cancer or from an individual that is known to have cancer but desires additional information of the cancer, such as the tissue that it originates from, whether it has metastasized, and/or the staging of the cancer, is determined. The sample may originate from any tissue or source of the individual, but in particular embodiments it comes from blood, serum, urine, cheek scrapings, nipple aspirate, biopsy, feces, saliva, sweat, or cerebrospinal fluid, for example.

Thus, in specific embodiments, when it is determined for a sample that at least part of the sequence of at least one DNA molecule is hypermethylated, it is indicated that the individual is susceptible to developing cancer or has cancer. In embodiments wherein it is determined that at least part of the sequence of at least one DNA molecule is hypomethylated, it is indicated that the individual is not susceptible to cancer and/or does not have cancer. The part of the sequence may comprise a CpG island, a promoter, or both, for example.

The cancer that the individual providing the sample is suspected of having, or already has, may be any cancer. In specific embodiments, the cancer is lung, breast, head and neck, prostate, brain, liver, pancreatic, ovarian, spleen, skin, bone, thyroid, kidney, throat, cervical, testicular, melanoma, leukemia, esophageal, or colon, for example.

As such, in a particular embodiment of the invention there is a kit housed in a suitable container that comprises one or more compositions of the present invention for diagnosis, prognosis, and/or characterization of cancer from one or more individuals.

The methods of the present invention can be used for the detection and analysis of a broad range of pathological conditions and physiological processes, for example. Clinical applications can include but are not limited to the following: diagnosis and/or prognosis of cancer, immune disorders, toxicity, central nervous system disorders, proliferative disorders, metabolic malfunctions and disorders, infection, inflammation, cardio-vascular disease, developmental abnormalities, pre-natal diagnosis, etc.

In other embodiments of the invention, methods and compositions are utilized for research applications, for example, for the study of normal physiological processes including the following: control of gene expression, gene silencing and imprinting, X chromosome inactivation, growth and development, ageing, and tissue and cell type-specific gene expression.

In particular aspects, the methods described herein provide non-invasive, rapid, sensitive and economical ways to detect methylation. They are easy to automate and apply in a high-throughput setting for disease diagnostics, research, and/or discovery of new methylation markers for cancer and other medical conditions.

In one embodiment of the invention, there is a method of preparing a DNA molecule, comprising (a) providing a DNA molecule; (b) digesting the DNA molecule with at least one methylation-sensitive restriction enzyme; (c) incorporating a nucleic acid molecule (which may be referred to as incorporating nucleic acid sequence) onto ends of the DNA fragments to provide first modified DNA molecules, by one of the following: (1) incorporating at least one primer from a plurality of primers, said primers comprising a 5′ constant sequence and a 3′ variable sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality; or (2) incorporating an adaptor comprising an inverted repeat and a loop, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the fragment, thereby producing an adaptor-linked fragment comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-linked fragment; and (d) amplifying one or more of the first modified DNA molecules to provide amplified modified DNA molecules.

In specific embodiments of the method, the incorporating step comprises incorporating at least one primer from a plurality of primers, said primers comprising a 5′ constant sequence and a 3′ variable sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality. In other specific embodiments, the incorporating step comprises incorporating a first adaptor having a nonblocked 3′ end to produce first adaptor-linked fragments, wherein the 5′ end of the digested fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the fragment and a 5′ end of the first adaptor, and extending the 3′ end of the fragment from the nick site. In a further specific embodiment, the incorporating step comprises incorporating an adaptor comprising an inverted repeat and a loop, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the fragment, thereby producing an adaptor-linked fragment comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-linked fragment.

In specific embodiments of the invention, the method further comprises analyzing at least part of the sequence of an amplified modified DNA molecule. In further specific embodiments, the DNA molecule that is provided comprises genomic DNA, such as one comprising a genome. The provided DNA molecule may be provided from a body fluid, such as blood, serum, urine, cerebrospinal fluid, nipple aspirate, sweat, or saliva, or from a tissue, such as biopsy, surgical sample, cheek scrapings, or feces. In a particular aspect of the invention, the DNA molecule that is provided is from a sample of an individual that has a medical condition, such as cancer, for example. In another specific embodiment, the methylation-sensitive restriction enzyme has a 4-5 base pair recognition site that comprises at least one CpG dinucleotide, and exemplary embodiments include Aci I, Bst UI, Hha I, HinP1, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi EI, Hga I, or a mixture thereof.
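As an illustration of how a methylation-sensitive restriction enzyme distinguishes methylated from unmethylated sites, the following minimal sketch locates recognition sites in a sequence and reports which of them remain cleavable. The recognition sequences listed are the commonly cited ones for four of the enzymes named above and are stated here as assumptions; the function names, the example sequence, and the simplified rule that any methylated CpG lying within a site blocks cleavage (with symmetric methylation assumed) are hypothetical and for illustration only.

```python
import re

# Commonly cited recognition sequences (assumed here for illustration).
SITES = {"HpaII": "CCGG", "HhaI": "GCGC", "BstUI": "CGCG", "AciI": "CCGC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, enzyme):
    """Return sorted (start, end) spans of the enzyme's site on either strand."""
    site = SITES[enzyme]
    hits = set()
    for pattern in {site, revcomp(site)}:  # non-palindromic sites need both
        for m in re.finditer(f"(?={pattern})", seq):
            hits.add((m.start(), m.start() + len(pattern)))
    return sorted(hits)


def cleavable_sites(seq, enzyme, methylated_cpg):
    """Sites with no methylated CpG inside them.

    methylated_cpg holds 0-based positions of the C of each methylated CpG.
    Simplification: a methylated CpG fully inside a site blocks cleavage.
    """
    return [
        (start, end)
        for start, end in find_sites(seq, enzyme)
        if not any(start <= c and c + 1 < end for c in methylated_cpg)
    ]


# Hypothetical sequence with one HpaII site (CpG at 3) and one HhaI site (CpG at 9).
seq = "AACCGGTTGCGCAA"
hpaii_all = find_sites(seq, "HpaII")
hpaii_when_methylated = cleavable_sites(seq, "HpaII", {3})
hhai_when_hpaii_methylated = cleavable_sites(seq, "HhaI", {3})
```

Under these assumptions, methylation of the CpG at position 3 protects the Hpa II site while leaving the Hha I site cleavable, so only fragments spanning methylated sites stay intact for subsequent amplification.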

In one aspect of the invention, the incorporating step may be further defined as generating single stranded nucleic acid molecules from the DNA fragments; subjecting the single stranded nucleic acid molecules to a plurality of primers to form a single stranded nucleic acid molecule/primer mixture, wherein the primers comprise nucleic acid sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality and wherein the primers comprise a constant nucleic acid sequence and a variable nucleic acid sequence; and subjecting said single stranded nucleic acid molecule/primer mixture to a polymerase to generate a plurality of molecules comprising the constant nucleic acid sequence at each end.

In another aspect of the invention, the incorporating step may be further defined as providing in a single incubation the following: at least one DNA fragment; a hairpin adaptor comprising an inverted repeat and a loop; DNA polymerase comprising 3′-5′ exonuclease activity; uracil-DNA-glycosylase; DNA ligase; dNTPs; ATP; and a buffer suitable for activity of the polymerase, glycosylase, and ligase. In a specific embodiment, the incubation further comprises a mixture of methylation-sensitive restriction enzymes, and said buffer is further suitable for activity of the restriction enzymes. In another specific embodiment, the inverted repeat comprises at least one replication stop, such as one generated in the synthesis of the hairpin adaptor, for example by incorporation of a non-replicable base analog, or the replication stop may be generated by converting deoxyuridine to an abasic site, such as with the enzyme uracil-DNA-glycosylase.

In additional aspects of the invention, the appropriate methods of the invention further comprise digesting the amplified first modified DNA molecules with the at least one methylation-sensitive restriction enzyme; incorporating a nucleic acid molecule onto ends of the amplified first modified fragments to provide second modified DNA molecules, by one of the following: (1) incorporating a second adaptor having a nonblocked 3′ end to produce second adaptor-linked fragments, wherein the 5′ end of the fragment is attached to the nonblocked 3′ end of the second adaptor, leaving a nick site between the juxtaposed 3′ end of the fragment and a 5′ end of the second adaptor, and extending the 3′ end of the molecule from the nick site; or (2) incorporating an adaptor comprising an inverted repeat and a loop, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the fragment, thereby producing an adaptor-linked fragment comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-linked DNA fragment; and amplifying the second modified DNA molecules to provide amplified second modified DNA molecules.

In specific aspects of the invention, methods comprise analyzing the amplified second modified DNA molecules to determine the methylation status of the provided DNA molecule. Methods of the invention may also further comprise the step of heating the second modified DNA molecules and/or the second adaptor-linked fragments, wherein the extension in the second adaptor-linked fragment has not occurred, to a temperature that causes denaturation of a specific fraction of the DNA.

In specific embodiments of the invention, the incorporating step (2) is further defined as providing in a single incubation the following: at least one amplified first modified fragment; a hairpin adaptor comprising an inverted repeat and a loop; DNA polymerase comprising 3′-5′ exonuclease activity; uracil-DNA-glycosylase; DNA ligase; dNTPs; ATP; and a buffer suitable for activity of the polymerase, glycosylase, and ligase. In a specific embodiment, the incubation further comprises a mixture of methylation-sensitive restriction enzymes, and the buffer is further suitable for activity of the restriction enzymes. In a specific embodiment, the method is further defined as determining the methylation status of at least part of the provided DNA molecule and/or may be further defined as performing the method with a provided molecule from a sample of an individual with a medical condition in comparison to a control. The provided DNA molecule may comprise a promoter, a CpG island, or both, in particular aspects of the invention, and/or it may also be further defined as bisulfite-converted DNA.

In another aspect of the invention, there is a method of preparing a DNA molecule, comprising: (a) providing a DNA molecule; (b) digesting the molecule with one or more methylation-specific restriction enzymes to provide DNA fragments; (c) incorporating a nucleic acid molecule onto the ends of the DNA fragments to provide first modified DNA molecules, by a method comprising: (1) incorporating at least one primer from a plurality of primers, said primer comprising a 5′ constant sequence and a 3′ variable sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality; (2) incorporating a first adaptor having a nonblocked 3′ end to produce first adaptor-linked fragments, wherein the 5′ end of the digested fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the fragment and a 5′ end of the first adaptor, and extending the 3′ end of the fragment from the nick site; or (3) incorporating an adaptor comprising an inverted repeat and a loop, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the fragment, thereby producing an adaptor-linked fragment comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-linked fragment; and (d) amplifying at least one first modified DNA molecule to provide amplified DNA molecules. In a specific aspect, the amplifying step utilizes a primer that is complementary to the incorporated sequence. In another specific aspect, the method further comprises the step of analyzing at least one of the amplified first modified DNA molecules to determine the methylation status of the provided DNA. The methylation-specific endonuclease may be McrBC, in specific aspects.

In an additional embodiment of the invention, there is a method of preparing a DNA molecule, comprising: (a) providing one or more nucleic acid molecules; (b) incorporating a nucleic acid molecule at the ends of the molecules by one or more of the following, wherein the incorporated molecule is resistant to bisulfite conversion, to provide first modified DNA molecules: (1) incorporating sequence by attaching a first adaptor having a nonblocked 3′ end to the ends of the molecule to produce first adaptor-linked molecules, wherein the 5′ end of the molecule is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the molecule and a 5′ end of the first adaptor, and extending the 3′ end of the molecule from the nick site; or (2) incorporating sequence by providing an adaptor comprising an inverted repeat and a loop, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the DNA molecule, thereby producing an adaptor-linked DNA molecule comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-linked DNA molecule; (c) providing sodium bisulfite to said first modified nucleic acid molecules, wherein the unmethylated cytosines in said nucleic acid molecules are converted to uracil, thereby producing bisulfite-converted single-stranded nucleic acid molecules; and (d) amplifying one or more of the bisulfite-converted molecules.

Particular methods of the invention may further comprise the step of analyzing the amplified bisulfite-converted molecules to determine the methylation status of the provided DNA molecule. The method may also be further defined as performing the method with a provided molecule suspected of being from a cancerous sample in comparison to a control. In particular aspects, the method further comprises digesting the nucleic acid molecules, the first modified DNA molecules, or the bisulfite-converted molecules with a methylation-sensitive restriction enzyme. In specific embodiments, the method further comprises analyzing the digested nucleic acid molecules, the digested first modified DNA molecules, or the digested bisulfite-converted molecules to determine the methylation status of the provided nucleic acid molecules.

In an additional aspect of the invention, there is a method of preparing a DNA molecule, comprising the steps of: (1) providing a DNA molecule; (2) altering the molecule in a single incubation to produce adaptor-linked molecules, said incubation comprising two or more of the following: (a) modifying the ends of the DNA molecules to provide attachable ends; (b) repairing nicks and/or gaps within the DNA molecules; (c) attaching a first hairpin adaptor comprising a known sequence and a nonblocked 3′ end to the ends of the DNA molecules to produce adaptor-linked molecules, wherein the 5′ end of the DNA is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA molecule and a 5′ end of the adaptor; and (d) extending the 3′ end of the DNA molecules from the nick site; (3) digesting the adaptor-linked DNA molecules with a mixture of methylation-sensitive restriction enzymes that do not cleave within the attached first adaptor; and (4) amplifying the digested first adaptor-linked DNA molecules with a primer complementary to at least a portion of the stem region of the first adaptor to produce amplified adaptor-linked fragments.

The digestion of the DNA molecules with the mixture of methylation-sensitive restriction enzymes may occur during the altering step. The methylation-sensitive restriction enzyme cleaves at a site comprising a CpG dinucleotide, and exemplary methylation-sensitive restriction enzymes include Aci I, BstU I, Hha I, HinP1 I, Hpa II, Hpy99 I, Ava I, BceA I, BsaH I, BsiE I, Hga I, or a mixture of at least two thereof, such as a mixture of three, four, five, six, seven, eight, nine, ten, or eleven. In specific aspects, the attached hairpin adaptor comprises a non-replicable region in its loop. The non-replicable region may be generated during the altering of the DNA molecule, for example. In other embodiments, the non-replicable region comprises at least one abasic site, such as one that is generated from deoxyuridines comprised within the 5′ stem and loop region of the first hairpin adaptor.

The altering step may occur in a solution that comprises a DNA polymerase, a ligase, and, optionally, a uracil-DNA glycosylase, wherein said solution is suitable for activity of said polymerase, ligase, and, optionally, a glycosylase. In other embodiments, the altering step occurs in a solution that comprises a DNA polymerase, ligase, optionally a uracil-DNA glycosylase, and optionally a mixture of methylation-specific restriction enzymes, wherein the solution is suitable for activity of said polymerase, ligase, optionally a glycosylase, and optionally restriction enzymes. In a specific embodiment, the 3′ end of the DNA molecules is extended from the nick site up to a non-replicable region of the first adaptor.

In particular embodiments, amplifying comprises a first heating step to fragment abasic regions of the first adaptor-linked molecules. The method may further comprise the step wherein sodium bisulfite is provided to the first adaptor-linked molecules, wherein the unmethylated cytosines in the first adaptor-linked molecules are converted to uracil, thereby producing bisulfite-converted molecules. In a specific embodiment, the adaptor is further defined as comprising a 3′ stem region, wherein the 3′ stem region does not comprise guanine and wherein all cytosines are methylated. The method may further comprise the step of enriching for first-adaptor attached molecules comprising CpG-rich regions, such as by heating. In further aspects of the invention, a subset of first-adaptor attached molecules is denatured.

In some aspects, the method further comprises the step of comparing at least part of the sequence of the amplified adaptor-linked fragment with a control DNA molecule that was not subjected to the digestion step. The method may also further comprise digesting the amplified first adaptor-linked fragments with at least one of the methylation-sensitive restriction enzymes in the mixture; attaching a second adaptor to at least one digested adaptor-linked fragment to produce a second adaptor-linked fragment, wherein the 5′ end of the digested amplified DNA fragment is attached to the nonblocked 3′ end of the second adaptor, leaving a nick site between the juxtaposed 3′ end of the fragment and a 5′ end of the second adaptor; extending the 3′ end of the digested amplified DNA fragment from the nick site; and amplifying the second adaptor-linked fragments with a primer complementary to at least part of the second adaptor to produce amplified second adaptor-linked fragments.

Analysis of the amplified second adaptor-linked fragments may be performed to determine the methylation status of the provided DNA. In specific embodiments, the second adaptor comprises at least one end that is complementary to the ends of the digested amplified DNA fragments. The second adaptor may comprise at least one blunt end and/or one or more known sequences, such as those wherein the one or more known sequences are substantially non-self-complementary and substantially non-complementary to other second adaptors.

In specific embodiments, the DNA molecule is obtained from plasma, serum, or urine.

In other embodiments of the invention, there is a method of detecting a condition in an individual, comprising the steps of: (1) providing at least one DNA molecule from the plasma, serum, or urine of the individual; (2) altering the molecule in a single incubation, said incubation comprising: (a) modifying the ends of the DNA molecules to provide attachable ends; (b) repairing nicks and/or gaps within the DNA molecule; (c) attaching a first hairpin adaptor comprising a stem, a known sequence and a nonblocked 3′ end to the ends of the DNA molecules to produce adaptor-linked molecules, wherein the 5′ end of the DNA is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA molecule and a 5′ end of the adaptor; (d) extending the 3′ end of the DNA molecules from the nick site; and (e) digesting the altered DNA molecules with a mixture of methylation-sensitive restriction enzymes that do not cleave within the attached first adaptor; and (3) amplifying the first adaptor-linked DNA molecules with a primer complementary to at least a portion of the stem region of the first adaptor to produce amplified first adaptor-linked fragments.

In particular embodiments, the method further comprises analyzing amplified first adaptor-linked fragments that are representative of said condition, such as those that comprise a characteristic methylation status. In specific embodiments, the condition is cancer and the amplified adaptor-linked fragments comprise methylated promoter regions, such as, for example, regions that comprise at least one CpG island.

In another embodiment, there is a method of identifying DNA regions associated with a condition, comprising the steps of: (1) obtaining at least one DNA molecule from the plasma, serum, or urine of one or more individuals with the condition and one or more individuals without the condition; (2) altering the molecule in a single incubation, said incubation comprising: (a) modifying the ends of the DNA molecules to provide attachable ends; (b) repairing nicks and/or gaps within the DNA molecules; (c) attaching a first hairpin adaptor comprising a known sequence, a stem, and a nonblocked 3′ end to the ends of the DNA molecules to produce adaptor-linked molecules, wherein the 5′ end of the DNA is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the DNA molecule and a 5′ end of the adaptor; (d) extending the 3′ end of the DNA molecules from the nick site; and (e) digesting with a mixture of methylation-sensitive restriction enzymes that do not cleave within the attached first adaptor; (3) amplifying the first adaptor-linked DNA molecules with a primer complementary to at least a portion of the stem region of the first adaptor; and (4) identifying at least one specific amplified first adaptor-linked fragment that is commonly produced from DNA from individuals with said condition but not from DNA from individuals without said condition. The identifying step comprises sequencing, quantitative real-time polymerase chain reaction, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, or a combination thereof, in particular aspects.

In an additional aspect of the invention, there is a kit for single incubation synthesis of a methylome library, said kit housed in a suitable container, comprising: a buffer suitable for activity of a DNA polymerase, ligase, uracil-DNA-glycosylase, and methylation-sensitive restriction enzyme; and one or more of the following: a hairpin adaptor; a DNA polymerase; a ligase; uracil-DNA-glycosylase; at least one methylation-sensitive restriction enzyme. In a specific embodiment, the adaptor is further defined as comprising at least one of the following: absence of a 5′ phosphate group; a non-blocked 3′ end; and deoxyuridines comprised within the 5′ stem and loop region.

In another embodiment, there is a method of preparing a DNA molecule, comprising: (a) providing DNA resulting from apoptotic degradation; (b) digesting the DNA molecule with at least one methylation-sensitive restriction enzyme; (c) incorporating a first adaptor having a nonblocked 3′ end to produce first adaptor-linked fragments, wherein the 5′ end of the digested fragment is attached to the nonblocked 3′ end of the adaptor, leaving a nick site between the juxtaposed 3′ end of the fragment and a 5′ end of the first adaptor, and extending the 3′ end of the fragment from the nick site; and (d) amplifying one or more of the first modified DNA molecules to provide amplified modified DNA molecules.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

FIG. 1 illustrates a schematic presentation of whole genome amplification by incorporating known sequence with self-inert degenerate primers (see U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403, incorporated by reference herein in its entirety) followed by PCR amplification. Dashed lines represent newly synthesized strands. Thicker lines represent the known (universal) sequence.

FIG. 2 is a schematic presentation of the design of exemplary self-inert degenerate primers with reduced ability to form primer-dimers (see U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403).

FIG. 3 shows a schematic description of the process of sodium bisulfite conversion of DNA. DNA is treated with sodium bisulfite to chemically convert cytosine to uracil. Methylated cytosines are resistant to this chemical reaction and thus are not converted to uracil. Exemplary sequences shown in FIG. 3 include GGGGCGGACCGCG (SEQ ID NO: 207), GGGGUGGAUUGUG (SEQ ID NO: 208), GGGGC^(m)GGACC^(m)GC^(m)G (SEQ ID NO: 209) and GGGGC^(m)GGAUC^(m)GC^(m)G (SEQ ID NO: 210).
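
The conversion rule described for FIG. 3 may be illustrated, as a sketch only, by the following function, which replaces every unmethylated cytosine with uracil and leaves methylated cytosines unchanged. The function name and the convention of passing methylated positions as 0-based indices are assumptions made for illustration, and the sketch treats conversion as complete, which an actual bisulfite reaction may not be.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """Return the sequence with unmethylated C converted to U.

    methylated_positions holds 0-based indices of cytosines assumed to be
    methylated and therefore resistant to conversion (hypothetical convention).
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("U")   # unmethylated cytosine is converted
        else:
            converted.append(base)  # methylated C and all other bases are kept
    return "".join(converted)


# Exemplary sequence of FIG. 3, first with no methylation and then with the
# three CpG cytosines (indices 4, 9 and 11) marked as methylated.
unmethylated = bisulfite_convert("GGGGCGGACCGCG")
methylated = bisulfite_convert("GGGGCGGACCGCG", {4, 9, 11})
```

With no methylation the result is GGGGUGGAUUGUG, matching SEQ ID NO: 208; with the three CpG cytosines methylated, only the non-CpG cytosine is converted, giving GGGGCGGAUCGCG, which corresponds to SEQ ID NO: 210 without the methylation marks.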

FIG. 4 depicts the principal steps in the reaction of chemical conversion of cytosine to uracil by sodium bisulfite and alkali treatment.

FIGS. 5A-5C provide an analysis of self-priming and extension of degenerate YN-primers (primers containing from 0 to 6 completely random bases (N) at the 3′ end, 10 degenerate pyrimidine bases Y, and the known pyrimidine sequence YU at the 5′ end (FIG. 2)). In FIG. 5A, YN primers containing 0, 1, 2 or 3 random N bases were used with or without dNTPs. In FIG. 5B, YN primers containing 0, 1, 2 or 3 random N bases and a model template oligonucleotide (exemplary SEQ ID NO: 9) were used. In FIG. 5C, self-priming of YN-primers was tested. Note: Pyrimidine bases do not stain with SYBR Gold.

FIG. 6 shows a comparison of whole genome amplification of DNA libraries prepared from 60 ng of bisulfite converted DNA or from 5 ng of non-converted DNA using the Klenow Exo⁻ fragment of DNA polymerase I and a combination of the self-inert degenerate primer R(N)₂ with the facilitating primer R_(U)(A)₁₀(N)₂ (exemplary SEQ ID NO: 10 and 18) in the first case, and with the self-inert degenerate primer K(N)₂ (exemplary SEQ ID NO: 14) in the second case. The flat line represents a blank control without genomic DNA for the reaction with K(N)₂ primers.

FIGS. 7A-7C show a comparison between different self-inert degenerate primer sequences supplemented with additional facilitating primers (added to facilitate priming of both strands of converted DNA) in their ability to support library synthesis from bisulfite-converted DNA and subsequent efficient amplification by PCR. The identities of the degenerate and facilitating primers used for each reaction are shown in the top right corner of each panel. Experimental details are described in Example 2 and primer sequences are listed in Table 1.

FIG. 8 demonstrates amplification of a genomic STS marker (STS sequence RH93704, UniSTS database, National Center for Biotechnology Information) with primer pairs specific for non-converted DNA (exemplary SEQ ID NO: 20 and 21) or specific for bisulfite-converted DNA (exemplary SEQ ID NO: 22 and 23) by real-time PCR using 10 ng of DNA amplified from bisulfite-converted DNA with a combination of self-inert degenerate primer R(N)₂ and facilitating primer R_(U)(A)₁₀(N)₂, or from non-converted DNA amplified with self-inert degenerate primer K(N)₂.

FIGS. 9A-9C illustrate the optimization of the cleavage of human genomic DNA with McrBC nuclease. FIG. 9A demonstrates the effect of dilution of McrBC on library preparation and amplification. DNA was digested with McrBC (0.02-0.10 U) for 1 h after which libraries were created and amplified. A control sample without McrBC cleavage was used for comparison. Dilution of McrBC results in lowered cleavage rates and fewer DNA molecules competent to form libraries. Digestion with higher amounts of McrBC does not result in earlier amplification of the resulting libraries, suggesting that 0.1 U of McrBC produces maximal digestion. FIG. 9B is a plot of the amount of McrBC used to digest DNA versus the cycle number at 50% Max RFU during amplification. This result indicates a sigmoidal relationship between the amount of McrBC and the effect on number of cycles necessary for amplification of the resulting libraries. FIG. 9C is a bar graph of the amount of McrBC (Units) versus % DNA digested by McrBC. The % McrBC digested DNA was calculated by setting 0.1 U McrBC as 100% and assuming a standard doubling reaction/PCR cycle. Therefore, each cycle shift to the left was converted to a 50% decrease in the digestion efficiency. The graph indicates that 50% digestion occurs after digestion with 0.7 U McrBC for 1 hour.
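
The calculation described for FIG. 9C can be sketched as follows, under stated assumptions. The library from the reference digestion (0.1 U McrBC) is assigned 100%, and each whole cycle by which another library differs from that reference halves the estimate, consistent with one doubling per PCR cycle. The function name is hypothetical, and the sketch assumes that a library reaching 50% Max RFU later than the reference reflects less digestion.

```python
def percent_digested(cycle_at_half_max, reference_cycle):
    """Estimate % digestion relative to the reference digestion (set to 100%).

    Both arguments are cycle numbers at 50% Max RFU. Assumes one doubling
    per PCR cycle, so each cycle of delay halves the estimate.
    """
    shift = cycle_at_half_max - reference_cycle
    return 100.0 * 0.5 ** shift
```

For instance, with hypothetical readings, a library that needs one more cycle than the reference is scored at 50% and one that needs two more cycles at 25%.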

FIG. 10 shows the distribution of fragments obtained after McrBC cleavage of human genomic DNA. Lane 1, molecular weight markers; Lane 2, undigested gDNA; Lane 3, DNA digested with 10 units of McrBC at 37° C.; Lane 4, DNA digested with 10 units of McrBC at 37° C. but not heated at 75° C. before loading; Lane 5, DNA digested with 10 units of McrBC at 37° C. and treated with Taq polymerase in the presence of dNTPs to fill-in 3′ recessed ends; Lane 6, Lambda genomic DNA; Lane 7, DNA digested with 5 units of McrBC at 37° C.; Lane 8, DNA digested with 2 units of McrBC at 37° C.; Lane 9, DNA digested with 10 units of McrBC at 16° C.; Lane 10, DNA digested with 10 units of McrBC at 25° C.; Lane 11, molecular weight markers.

FIG. 11 represents distribution plots of gel fractions obtained after McrBC cleavage of genomic DNA isolated from KG1-A leukemia cells or control genomic DNA (Coriell repository #NA16028) followed by separation on agarose gel and elution of DNA from gel slices. Aliquots of each eluted fraction were amplified by PCR using the following primers: p15 promoter (SEQ ID NO: 24 forward and SEQ ID NO: 25 reverse), p16 promoter (SEQ ID NO: 26 forward and SEQ ID NO: 27 reverse), E-Cadherin promoter (SEQ ID NO: 28 forward and SEQ ID NO: 29 reverse) for sites internal to CpG islands, and p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 52 forward and SEQ ID NO: 53 reverse) for sites flanking the CpG islands, respectively. The following size fractions were analyzed: 7.5-12 Kb, 4.5-7.5 Kb, 3.0-4.5 Kb, 2.0-3.0 Kb, 1.5-2.0 Kb, 1.0-1.5 Kb, 0.65-1.0 Kb, 0.4-0.65 Kb, 0.25-0.4 Kb, and 0.05-0.25 Kb. The size of the fractions was plotted against the reciprocal of the threshold amplification cycle for each real-time PCR curve. Shown at the bottom of each panel are the PCR products separated on agarose gel. Fractions follow the same order as on the respective curve plots.

FIG. 12 shows an ethidium bromide-stained gel of amplified products from the McrBC-mediated direct promoter methylation assay. After cleavage of genomic DNA from normal cells or from leukemia KG1-A cells with McrBC nuclease, sites internal to the CpG islands of p15 (GENBANK® Accession No. AF513858), p16 (GENBANK® Accession No. AF527803), E-Cadherin (GENBANK® Accession No. AC099314), or GSTP-1 (GENBANK® Accession No. M24485) promoters were amplified using specific PCR primers (SEQ ID NO: 24+SEQ ID NO: 25, SEQ ID NO: 26+SEQ ID NO: 27, SEQ ID NO: 28+SEQ ID NO: 29, and SEQ ID NO: 30+SEQ ID NO: 31, respectively). DNA fully methylated with SssI CpG methylase was used as a positive control. Cleavage between methylated cytosines by McrBC results in lack of amplification and correlates with the methylation status of the promoters.

FIG. 13 shows an ethidium bromide-stained gel of amplified products from the McrBC-mediated library promoter methylation assay based on the attachment of a modular adaptor to McrBC cleavage sites allowing one-sided PCR between the adaptor and specific sites flanking the CpG island. In the first amplification step, a proximal T7 promoter sequence is ligated and used to amplify all fragments, followed by incorporation of a 5′ tail comprising 10 cytosines. This distal sequence allows asymmetric one-sided PCR amplification due to the strong suppression effect of the terminal poly-G/poly-C duplex. One-sided PCR was performed with C₁₀ primer (SEQ ID NO: 38), and primers specific for the human p15 promoter (SEQ ID NO: 39), p16 promoter (SEQ ID NO: 40), or E-Cadherin promoter (SEQ ID NO: 41 and SEQ ID NO: 42).

FIG. 14 demonstrates the sensitivity limits of the library methylation assay described in Example 6 and FIG. 13. Different ratios of McrBC libraries prepared from normal or cancer cells were mixed and then amplified with the universal C₁₀ primer (SEQ ID NO: 38) and a primer specific for the p15 promoter 5′ flanking region (SEQ ID NO: 39). The total amount of DNA was 50 ng per amplification reaction, containing 0, 0.1, 1.0, 10, 50, or 100% of cancer DNA.
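
For orientation, the mass of cancer and normal library DNA in each mixture of FIG. 14 follows directly from the stated total of 50 ng and the stated percentages. The short sketch below performs that arithmetic; the function and variable names are hypothetical, and the remainder of each mixture is assumed to be normal library DNA.

```python
TOTAL_DNA_NG = 50.0                               # total DNA per reaction (from FIG. 14)
CANCER_PERCENTAGES = (0, 0.1, 1.0, 10, 50, 100)   # % cancer DNA (from FIG. 14)


def mixing_table(total_ng=TOTAL_DNA_NG, percentages=CANCER_PERCENTAGES):
    """Return (percent, cancer_ng, normal_ng) for each mixture.

    Assumes the balance of each mixture is normal library DNA.
    """
    rows = []
    for percent in percentages:
        cancer_ng = total_ng * percent / 100.0
        rows.append((percent, cancer_ng, total_ng - cancer_ng))
    return rows


for percent, cancer_ng, normal_ng in mixing_table():
    print(f"{percent}% cancer: {cancer_ng:.2f} ng cancer + {normal_ng:.2f} ng normal")
```

On this arithmetic, the 0.1% mixture corresponds to 0.05 ng of cancer library DNA in a background of 49.95 ng of normal library DNA.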

FIG. 15 shows an ethidium bromide-stained gel of amplified products from the McrBC-mediated library promoter methylation assay based on ligation of nick-attaching biotinylated adaptor to McrBC cleavage sites, propagation of the nick to a controlled distance, and immobilization of the nick-translation products on streptavidin beads. Aliquots of the streptavidin beads containing immobilized nick-translation products from normal or cancer cells were used to amplify specific regions flanking promoter CpG islands using primer pairs specific for the human p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 50 forward and SEQ ID NO: 51 reverse).

FIG. 16 shows the products of amplification of a sequence flanking the CpG island of the p15 promoter in normal and cancer cells using DNA amplified with universal K_(U) primer (SEQ ID NO: 15) from immobilized nick-translation libraries described in FIG. 15. The products amplified with primers specific for the human p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse) are illustrated.

FIG. 17 shows the products of amplification of a sequence flanking the CpG island of the p15 promoter in normal and cancer cells using the McrBC-mediated library promoter methylation assay based on extension of 3′ recessed ends of McrBC cleavage sites in the presence of a biotin-containing nucleotide analog, followed by DNA fragmentation and immobilization on streptavidin magnetic beads. Aliquots of the streptavidin beads and a primer pair specific for a region flanking the CpG island of the human p15 promoter were used for PCR amplification (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse).

FIG. 18 shows the products of amplification of sequences flanking the CpG islands of p15, p16, and E-Cadherin promoters in normal and cancer cells using DNA amplified with universal K_(U) primer (SEQ ID NO: 15) from immobilized fill-in libraries described in FIG. 17. The products amplified with primers specific for the human p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 50 forward and SEQ ID NO: 51 reverse) are depicted.

FIGS. 19A-19B illustrate the analysis of the nature of the ends produced by McrBC cleavage as well as direct ligation of adaptors with 5′-overhangs to McrBC cleavage sites without any prior enzymatic repair. FIG. 19A shows the requirement for polishing in order to ligate blunt-ended adaptors and amplify McrBC digested DNA. Furthermore, the ability of Klenow Exo- to polish McrBC-cleaved DNA as effectively as Klenow indicates that McrBC cleavage results in 5′ overhangs with competent 3′ ends. Omitting the polishing step results in amplification identical to that of the no DNA negative control. FIG. 19B shows amplification of libraries prepared by McrBC cleavage after ligation of an adaptor containing universal T7 promoter sequence and 5′ overhangs comprising from 0 to 6 completely random bases. The amplification of non-polished samples ligated to adaptors with 5 or 6 base overhangs was identical to the control polished sample ligated to blunt-end (0 overhang) adaptor, indicating that the 5′ overhangs produced by McrBC cleavage are at least 6 bases long. Adaptors with overhangs shorter than 5 bases were much less efficient. This result indicates that a minimum of 5 bases is required to support efficient hybridization and subsequent ligation of adaptors to McrBC overhangs.

FIG. 20 illustrates library amplification aimed at determining the optimal amount of T7 adaptor with a 6-base overhang for efficient ligation to McrBC ends. Ligation of adaptor to 10 ng of McrBC digested DNA was with 1000 units of T4 ligase and 0, 0.032, 0.064, 0.125, 0.25, 0.5, or 1 μM final adaptor concentration.

FIG. 21 shows the amplification of a short sequence from the CpG island of the p16 promoter in normal and cancer cells from libraries comprising short amplifiable DNA sequences generated by McrBC cleavage of 10 ng and 50 ng of genomic DNA, ligation of universal adaptor T7-N6 (SEQ ID NO: 32 and SEQ ID NO: 59), size fractionation through Microcon YM-100 membrane filter, and amplification with universal T7 primer (SEQ ID NO: 37). C=cancer DNA, N=normal DNA.

FIG. 22 demonstrates amplification of libraries comprising short amplifiable DNA sequences generated by McrBC cleavage of 10 ng, 1 ng, or 0.1 ng of genomic DNA after ligation of universal adaptor T7-N6 (SEQ ID NO: 32 and SEQ ID NO: 59).

FIG. 23 illustrates amplification of a short sequence from the CpG island of the p16 promoter in normal and cancer cells from libraries comprising short amplifiable DNA sequences generated by McrBC cleavage of 10 ng, 1 ng, or 0.1 ng of genomic DNA, ligation of universal adaptor T7-N6 (SEQ ID NO: 32 and SEQ ID NO: 59), amplification with universal T7 primer (shown in FIG. 22), size fractionation through Microcon YM-100 membrane filter, and re-amplification with universal T7 primer (SEQ ID NO: 37). The insert to FIG. 23 shows analysis of the short p16 amplicon on 1% agarose gel after staining with ethidium bromide. C=cancer DNA, N=normal DNA.

FIG. 24 depicts amplification of a short sequence from the CpG island of the p16 promoter in normal and cancer cells from 4 ng or 20 ng of libraries comprising short amplifiable DNA sequences generated by McrBC cleavage of 10 ng or 50 ng of genomic DNA, ligation of universal adaptors T7-N6 and GT-N6 (SEQ ID NO: 32 and SEQ ID NO: 59, and SEQ ID NO: 15 and SEQ ID NO: 60, respectively), and amplification with universal T7 and Ku primers (SEQ ID NO: 37 and SEQ ID NO: 15). C=cancer DNA, N=normal DNA.

FIG. 25 shows amplification of a short sequence from the CpG island of the p16 promoter in normal and cancer cells from 4 ng of libraries comprising short amplifiable DNA sequences generated by cleavage of 10 ng of genomic DNA with 0, 0.5, 1, 2, 5, or 10 units of McrBC, ligation of universal adaptors T7-N6 and GT-N6 (SEQ ID NO: 32 and SEQ ID NO: 59, and SEQ ID NO: 15 and SEQ ID NO: 60, respectively), and amplification with universal T7 and Ku primers (SEQ ID NO: 37 and SEQ ID NO: 15). C=cancer DNA, N=normal DNA.

FIG. 26 demonstrates preparation of a methylation specific library based on cleavage using the methylation-sensitive restriction enzyme Not I. Briefly, genomic DNA is digested with Not I, randomly fragmented, and subsequently converted to a Not I methylation-specific whole-genome library. The resulting library is amplified using a T7-C₁₀ primer (SEQ ID NO: 36). The purified product of the first amplification is subsequently digested again with Not I and universal GT adaptors are ligated to the resulting ends. Finally, only those sequences that had a GT adaptor ligated to them are amplified by PCR using K_(U) and C₁₀ universal primers (SEQ ID NO: 15 and SEQ ID NO: 38, respectively). Sequences that contain the C₁₀ primer sequence at both ends of the molecule are unable to be amplified due to the characteristics of this type of molecule (U.S. application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791, incorporated by reference herein in its entirety).

FIG. 27 illustrates the results of real-time PCR analysis of 14 markers corresponding to sites adjacent to known Not I restriction sites. Both control and Not I-digested DNA samples were analyzed. All 14 sites were detected in the control DNA, indicating that all sites were efficiently cleaved and amplified when there is no methylation present. In contrast, only 7 of the 14 sites were detected in Not I-digested DNA, indicating that half of the 14 sites were methylated in the starting DNA.

FIG. 28 illustrates the results of real-time PCR analysis of 6 markers corresponding to sites adjacent to known Not I restriction sites. Both control and Not I-digested DNA samples were analyzed. All 6 sites were detected in genomic DNA, indicating that all sites were efficiently cleaved and amplified when there is no methylation present. In contrast, only 3 of the 6 sites were detected in Not I-digested DNA, indicating that half of the sites were methylated in the starting DNA.

FIG. 29A-29B depict two methods for library preparation and amplification of hypomethylated regions of DNA based on use of the methylation-specific endonuclease McrBC. In FIG. 29A, genomic DNA is digested with the methylation-specific endonuclease McrBC. Hypermethylated regions are digested into pieces not suitable for library generation. Following cleavage, DNA is randomly fragmented by chemical or mechanical means and is converted into libraries by attachment of universal adaptors as described in Example 15. The resulting amplicons, specific to regions of hypomethylation, are amplified and can be analyzed by techniques such as PCR amplification and microarray hybridization, for example. In FIG. 29B, a second method of library preparation is illustrated wherein a poly C adaptor sequence (12-40 bp) is attached to polished ends following McrBC cleavage. The presence of the poly C sequence prevents amplification of DNA amplicons from hypermethylated regions that contain the poly C sequence at both ends (US Patent Application 20030143599). Libraries are created, amplified and analyzed as in FIG. 29A.

FIG. 30 demonstrates a second method for the amplification of hypomethylated regions of DNA through use of the methylation-specific endonuclease McrBC. Genomic DNA is randomly fragmented by mechanical means, and the resulting products are polished to produce blunt ends. Following polishing, universal adaptors are ligated to both ends of the molecules resulting in generation of an amplifiable library. The library is digested with McrBC, which results in cleavage of all amplicons that contain 2 or more methylated cytosines. The intact amplicons within the library are then amplified with the universal primer. The resulting products represent regions of hypomethylation within the genome and can be analyzed by PCR amplification for specific sequences, or by genome-wide hybridization for discovery and/or diagnostic purposes.

FIG. 31 demonstrates another method for the amplification of hypomethylated regions of DNA through use of the methylation-specific endonuclease McrBC. Genomic DNA is randomly fragmented by chemical means and the resulting single-stranded DNA fragments are converted into fragments with double-stranded blunt ends by a combination fill-in and polishing reaction. Universal adaptors are attached to the ends of the fragments. Following ligation, a single cycle of PCR with a thermolabile DNA polymerase is performed with a universal oligo containing a single methyl group. The resulting amplicons are digested with the methylation-specific endonuclease McrBC that will result in cleavage of all amplicons containing one or more methyl cytosines on the original parent strand. After digestion, intact strands are amplified using universal primers. The resulting products represent regions of hypomethylation within the genome and can be analyzed by PCR amplification for specific sequences, or by genome-wide hybridization for discovery or diagnostic purposes. A methylated oligo is utilized for the single amplification cycle if McrBC is only able to cleave molecules that have methyl groups in a trans orientation. An alternative method utilizing a non-methylated oligo for the single PCR step can be used if McrBC is able to cleave molecules that are methylated only in a cis orientation.

FIG. 32 illustrates the structure of the various adaptor sequences used in library preparation. Structures of the blunt-end, 5′ overhang, and 3′ overhang adaptors used in the initial library construction are provided. Structure of the adaptor for ligation to Not I digested DNA that contains the Not I overhang is provided. Note that the ligation of this adaptor will not result in a functional Not I cleavage site.

FIG. 33A depicts a method for detecting DNA methylation in cancer cells using methylation-sensitive restriction endonucleases and whole genome amplification. DNA from cancer and normal cells is incubated in the presence of a methylation-sensitive restriction endonuclease, such as, for example, Hpa II. This results in the cleavage of DNA from normal cells containing the Hpa II recognition sites, but not the DNA from cancer cells that is methylated. Primary Methylome libraries are prepared and amplified resulting in all sequences amplified in the cancer cells, while the promoter sequences containing the Hpa II restriction sites in the normal cells are not amplified due to the fact that they are cleaved during the digestion step. Analysis of the resulting DNA products allows the determination of which samples contained methylated restriction sites, as only those sites are detectable.

FIGS. 33B-33C illustrate the method of synthesis of a Methylome library similar to that shown in FIG. 33A, the only major difference being that instead of one enzyme a mix of multiple methylation-sensitive restriction enzymes is used in one reaction to efficiently cleave all non-methylated CpG-rich regions (islands) within the DNA. A nuclease cocktail converts such regions into very short DNA fragments that fail to amplify efficiently by the implemented whole methylome amplification (WMA) method.

FIGS. 33D-33E illustrate similarity in distribution of the density of CpG dinucleotides and restriction sites for more than one restriction endonuclease, such as from the following, for example: 11 restriction nucleases (Aci I, BstU I, Hha I, HinP1 I, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi E1, and Hga I) that can be used in one reaction cocktail for preparation of Methylome libraries.

FIG. 34 illustrates one exemplary method of analyzing the products produced in FIGS. 33A-33E. Specifically, quantitative real-time PCR is used with primer pairs that are within the region of interest (i.e., the promoter sequence containing the restriction digest site). The shift in the number of cycles necessary for amplification between normal cells and cancer cells is an indication of methylation in the cancer sample. The products of a single Methylome library amplification can be dispensed into a 96 well plate, allowing the simultaneous determination of the methylation status of 96 promoter regions.

FIG. 35 demonstrates the use of DNA array hybridization for the analysis of the products produced in FIGS. 33A-33E. Promoter sites of interest can be spotted on an array and hybridized with amplified products from normal or cancer cells. Normal cells, which exhibit low levels of methylation, will have very few, if any, sites that can be detected. In contrast, the detection of methylated promoters in cancer samples will result in a strong hybridization signal. Control hybridizations, such as using undigested genomic DNA, will validate the detection of all promoter sites.

FIG. 36 illustrates the analysis of the average size of DNA fragments obtained after overnight digestion of genomic DNA with four methylation-sensitive restriction enzymes with 4-base recognition sites containing at least one CpG dinucleotide. Aliquots of 165 ng of digestion reactions are analyzed on 1% agarose gel after staining with SYBR Gold. Lanes 1 and 10 contain 1Kb Plus DNA ladder (Invitrogen); Lanes 3, 5, 7, 9, 11, 13, and 14 are blank; Lanes 2, 4, 6, 8, and 12 are DNA digested with HinP1 I, Hpa II, Aci I, BstU I, and non-digested control, respectively.

FIGS. 37A-37C demonstrate the real-time PCR amplification of specific promoter sequences from the CpG islands of the exemplary p15, p16, and E-Cadherin promoters in normal and cancer cells from libraries prepared by restriction digestion with BstU I (or control undigested DNA) followed by incorporation of universal sequence and subsequent amplification. The following exemplary primer pairs were used: p15 promoter region - primer pair #1 - p15 SF upstream (SEQ ID NO: 63) and p15 SB downstream (SEQ ID NO: 64), primer pair #2 - p15 Neg F upstream (SEQ ID NO: 24) and p15 Neg B downstream (SEQ ID NO: 25); p16 promoter region - primer pair #1 - p16 Nick F upstream (SEQ ID NO: 48) and p16 Nick B downstream (SEQ ID NO: 49), primer pair #2 - p16 LF upstream (SEQ ID NO: 65) and p16 LB downstream (SEQ ID NO: 66); E-Cadherin promoter region - primer pair #1 - E-Cad Neg F upstream (SEQ ID NO: 28) and E-Cad Neg B downstream (SEQ ID NO: 29), and primer pair #2 - E-Cad Neg F upstream (SEQ ID NO: 28) and E-Cad LB downstream (SEQ ID NO: 67). Four percent of dimethyl sulfoxide (DMSO) was included in all steps of the protocol. In addition, 7-deaza-dGTP was added at a final concentration of 200 μM in the library preparation (incorporation of universal sequence) step of all samples as well as in the library amplification step of all samples except the subset amplified with primer pair #1 of the p16 promoter (FIG. 37B). This set was supplemented with 0.5 M betaine instead. The specific sequence amplification of the p16 promoter with primer pair #2 (FIG. 37B) and of the E-Cadherin promoter with both primer pairs (FIG. 37C) was done in the presence of an additional 0.5 M betaine. The exemplary PCR conditions are detailed in Example 20.

FIGS. 38A-38B show the amplification by real-time PCR of a specific promoter sequence from the CpG island of the GSTP-1 gene of 3 clinical isolates of prostate adenocarcinoma and from RWPE prostate cancer cell line in primary whole Methylome libraries prepared from control undigested DNA (FIG. 38A), or from DNA digested with Aci I (FIG. 38B), followed by incorporation of universal sequence and subsequent amplification, as described in FIGS. 33A-33E. The primers were: GSTP-1 Neg F upstream (SEQ ID NO: 30) and GSTP1 Neg B2 downstream (SEQ ID NO: 68) amplifying a 200 bp promoter region. Details of the PCR conditions are described in Example 21.

FIG. 39 illustrates preparation of a whole genome library by chemical fragmentation using a non-strand displacing polymerase. Briefly, genomic DNA is fragmented chemically resulting in the production of single stranded DNA fragments with blocked 3′ ends. A fill-in reaction with a non-strand displacing polymerase is performed. The resulting dsDNA fragments have blunt or several bp overhangs at each end and may contain nicks of the newly synthesized DNA strand at the points where the 3′ end of an extension product meets the 5′ end of a distal extension product. Adaptor sequences are ligated to the 5′ ends of each side of the DNA fragment. Finally, an extension step is performed to displace the short 3′ blocked adaptor and extend the DNA fragment across the ligated adaptor sequence. This process results in only one competent strand for amplification if there are nicks present in the strand created during the fill-in reaction.

FIG. 40 represents an alternative model by which a whole genome library is prepared by chemical fragmentation using a strand-displacing polymerase. Briefly, genomic DNA is fragmented chemically resulting in the production of single stranded DNA fragments with blocked 3′ ends. A fill-in reaction with a strand displacing polymerase is performed. The resulting DNA fragments have a branched structure resulting in the creation of additional ends. All ends are either blunt or have several bp overhangs. Adaptor sequences are ligated to the 5′ ends of each end of the DNA fragments. Finally, an extension step is performed to displace the short 3′ blocked adaptor and extend the DNA fragment across the ligated adaptor sequence. This process may result in multiple strands of different sizes being competent to undergo subsequent amplification, depending on the amount of strand displacement that occurs. In the example depicted, the full-length parent strand and the most 3′ distal daughter strand are competent to undergo amplification.

FIG. 41 represents an alternative model by which a whole genome library is prepared by chemical fragmentation using a polymerase with nick translation ability. Briefly, genomic DNA is fragmented chemically, resulting in the production of single stranded DNA fragments with blocked 3′ ends. A fill-in reaction with a polymerase capable of nick translation is performed. The resulting dsDNA fragments have blunt or several bp overhangs at each end and the daughter strand is one continuous fragment. Adaptor sequences are ligated to the 5′ ends of each side of the DNA fragment. Finally, an extension step is performed to displace the short 3′ blocked adaptor and extend the DNA fragment across the ligated adaptor sequence. Both strands of the DNA fragment are suitable for amplification due to the creation of a full-length daughter strand by nick translation during the fill-in reaction.

FIG. 42 shows the structure of a specific adaptor and how it is ligated to blunt-ended double stranded DNA fragments, the resulting dsDNA fragments, and the extension step following ligation used to fill in the adaptor sequence and displace the blocked short adaptor.

FIG. 43A illustrates a method for the preparation and analysis of a secondary Methylome library from a primary Methylome library prepared by using only one methylation sensitive restriction enzyme (Hpa II). Briefly, amplicons from a primary Methylome library are digested with the same restriction endonuclease utilized in the creation of the primary library. A mixture of two adaptors (A and B) is ligated to the resulting cleaved ends to create the secondary Methylome library. PCR is then performed to amplify only those molecules that have adaptors A and B on either end. These amplified products are highly enriched for methylated promoter sequences and can be analyzed by microarray hybridization, PCR, capillary electrophoresis, or other methods known in the art.

FIG. 43B illustrates a method for the preparation and analysis of a secondary Methylome library from a primary Methylome library prepared by using a restriction enzyme cocktail of 5 or more methylation-sensitive restriction enzymes. Briefly, DNA aliquots from a primary Methylome library are digested separately with the restriction endonucleases R₁, R₂, R₃, . . . , R_(N) utilized in the synthesis of the primary library. Products of digestion are combined together and a mixture of two adaptors (A and B) is ligated to the resulting cleaved ends to create the secondary Methylome library. PCR is then performed to amplify only those molecules that have adaptors A and B on either end. These amplified products are highly enriched for methylated promoter sequences and can be analyzed by microarray hybridization, PCR, capillary electrophoresis, or other methods known in the art. It should also be noted that within this library the same genomic region is usually represented by many different restriction fragments, thus creating a redundancy and improved representation that is critical for many downstream applications of the secondary library, including microarray analysis, for example.

FIG. 44 is a depiction of how capillary electrophoresis can be utilized for analysis of secondary Methylome libraries. The complexity (N) of the secondary Methylome library is a function of the number of methylated CpG islands in the genome (n), and the average number of times a specific restriction endonuclease occurs in the CpG islands (m). An example is illustrated where 1% of CpG islands are methylated and the Hpa II restriction site is present 5 times/CpG island. This results in approximately 1,200 restriction fragments within the secondary library. Re-amplification of this library using 16 combinations of A and B oligos with a single 3′ selecting nucleotide would result in approximately 75 specific sequences/well. This level of complexity can be analyzed by capillary electrophoresis, allowing determination of the patterns of methylation in different samples without a priori knowledge of which CpG islands are important. Sequencing of the resulting products would allow the determination of the CpG islands that were methylated in the original sample.
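The per-well arithmetic of the FIG. 44 example can be checked with a short sketch. It takes the approximately 1,200 fragments stated in the text as given (the exact dependence of N on n and m is not spelled out here, so none is assumed) and assumes the 16 combinations arise from one of four possible 3′ selecting nucleotides on each of the A and B oligos. The function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def selector_combinations():
    """All (A oligo, B oligo) pairs of single 3' selecting nucleotides."""
    return list(product(BASES, repeat=2))


def sequences_per_well(library_complexity: int, n_wells: int) -> float:
    """Average number of distinct fragments per well, assuming an even split."""
    return library_complexity / n_wells


combos = selector_combinations()
# 1,200 is the approximate secondary-library complexity given for FIG. 44.
print(len(combos), sequences_per_well(1200, len(combos)))
```

The even split across wells is an idealization; real fragment ends would not be distributed uniformly over the 16 selector pairs.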

FIGS. 45A-45C demonstrate a method for the synthesis and amplification of methylation specific libraries from the exemplary serum and plasma DNA. The small size (200 bp-3 kb) of DNA extracted from serum and plasma allows the direct attachment of adaptors to these molecules (U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No. 2004/0209299 and now abandoned, incorporated by reference herein in its entirety). Digestion of the resulting library with a methylation-sensitive restriction endonuclease (FIG. 45A) results in cleavage of all molecules that contain an unmethylated restriction site. PCR amplification following digestion results in amplification of those molecules containing a methylated restriction site (resistant to cleavage), as well as molecules that do not contain the restriction site. The digested molecules that contained an unmethylated restriction site will not be able to serve as a template during PCR with universal primer. Digestion of the resulting library with a mixture of multiple restriction enzymes (such as, for example, 5 or more) (FIGS. 45B-45C) yields increased cleavage efficiency of molecules that contain several unmethylated CpG sites that coincide with restriction sites. The density of such restriction sites within the CpG-rich promoter regions is extremely high (see FIGS. 33D-33E) and can exceed 50 sites per 100 base pairs. PCR library amplification following digestion results in amplification of only those molecules that contain methylated restriction sites, as well as molecules that do not contain the restriction site. The digested molecules that contained an unmethylated restriction site or especially a group of unmethylated restriction sites will not survive the cleavage step intact and will not serve as a template during amplification. The resulting products can be analyzed by PCR, microarray hybridization, probe assay, or other methods known in the art, for example. Positive detection of signal indicates methylation in the starting sample.
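The selection logic described for FIGS. 45A-45C can be sketched in a few lines: a library molecule remains a PCR template only if none of its recognition sites is cleavable. The sketch below is a simplified model, not the protocol itself. It uses the standard recognition sequences of four of the enzymes named in this document, treats a site as protected only when every CpG within it is methylated, and ignores strand-specific (hemi-) methylation; the example sequence and function names are hypothetical.

```python
import re

# Recognition sequences (5'->3') of some methylation-sensitive enzymes named
# in the text. Aci I is non-palindromic, so both orientations are listed.
SITES = {
    "Hpa II": ["CCGG"],
    "Hha I": ["GCGC"],
    "BstU I": ["CGCG"],
    "Aci I": ["CCGC", "GCGG"],
}


def find_sites(seq, enzymes):
    """Return sorted (start, site) pairs, including overlapping matches."""
    hits = []
    for enzyme in enzymes:
        for site in SITES[enzyme]:
            for match in re.finditer(f"(?={site})", seq):
                hits.append((match.start(), site))
    return sorted(hits)


def survives_digestion(seq, methylated_cpgs, enzymes):
    """True if the molecule stays intact after digestion.

    methylated_cpgs: 0-based positions of the C in each methylated CpG.
    Simplifying assumption: a site is protected only if all of its CpGs
    are methylated; any other site is cut.
    """
    for start, site in find_sites(seq, enzymes):
        cpgs = [start + i for i in range(len(site) - 1) if site[i:i + 2] == "CG"]
        if not all(pos in methylated_cpgs for pos in cpgs):
            return False
    return True


fragment = "ATCCGGTAGCGCAT"  # hypothetical; Hpa II site at 2, Hha I site at 8
print(survives_digestion(fragment, {3, 9}, ["Hpa II", "Hha I"]))  # fully methylated
print(survives_digestion(fragment, {3}, ["Hpa II", "Hha I"]))     # Hha I site unmethylated
```

In this model a molecule with no recognition site also survives, which matches the statement above that molecules lacking the restriction site are amplified as well.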

FIG. 46 illustrates a method for the synthesis and amplification of a secondary methylation library from serum and plasma DNA. The primary library is created in the same manner as illustrated in Example 24 and FIGS. 45A-45B. Following amplification of the primary library with a primer containing a 5′ C₁₀ sequence (SEQ ID NO: 38), all methylated sites from the original library are converted to unmethylated sites. These sites can then be digested with the same restriction endonuclease(s) utilized in the generation of the primary methylation library (see FIGS. 43A-43B). Ligation of a second adaptor to the ends of the resulting cleavage fragments generates the secondary library. Amplification of this library with the C₁₀ oligo (SEQ ID NO: 38) and the second adaptor results in amplification of those molecules that contained a methylated restriction site in the original material. The molecules that did not contain a restriction site are not digested, ligated, or amplified in the secondary library. This results in enrichment of the specific methylated sequences in the secondary library, resulting in improved analysis of the amplification products.

FIG. 47 demonstrates a method for generating a methylation-specific library from serum and plasma DNA using the methylation specific endonuclease McrBC. The small size (200 bp-3 kb) of DNA extracted from serum and plasma allows the direct attachment of adaptors to these molecules, such as adaptors containing a C₁₀ sequence (SEQ ID NO: 38). Digestion of the resulting library with the methylation-specific restriction endonuclease McrBC results in cleavage between two methylated CpGs. Any molecules that contain fewer than 2 methylated CpGs are not digested. A second adaptor sequence can be ligated to the resulting cleaved fragments. Amplification of the resulting library with a C₁₀ oligo (SEQ ID NO: 38) and a primer complementary to the second adaptor results in amplification of only those fragments containing the second adaptor on one or both ends. Amplicons that were not cleaved by McrBC are not amplified due to the presence of the C₁₀ sequence (SEQ ID NO: 38) at both ends. The resulting amplified products can be assayed by microarray hybridization, PCR, probe assay, or other methods known in the art, for example, in order to determine which sequences were methylated in the original starting material.

FIG. 48 illustrates exemplary adaptor sequences utilized during ligation. Optimal ligation can be obtained using the 5′ T7N adaptors N2T7 and N5T7 combined with the 3′ T7N adaptors T7N2 and T7N5. However, it should be observed that acceptable results are obtained with a variety of combinations of adaptors as long as at least one adaptor containing a 5′ N overhang and one adaptor containing a 3′ N overhang are utilized together.

FIG. 49 depicts a method for preparation and amplification of whole genome libraries prior to bisulfite conversion. In this method, genomic DNA is randomly fragmented and adaptors are subsequently attached to the ends of the DNA fragments. These adaptors are resistant to bisulfite conversion and will maintain their sequence following bisulfite treatment (see FIG. 50). The DNA library undergoes bisulfite conversion and the products of this conversion are amplified using primers complementary to the adaptor sequence.

FIG. 50 illustrates attachment of two types of adaptor sequences that are resistant to bisulfite conversion used in FIG. 49. In the first case, oligo 1 does not contain any cytosines and is therefore resistant to conversion. Following attachment of the adaptor, the ends of the molecules are extended in the presence of dTTP, dATP, and dmCTP, but not dCTP or dGTP. Therefore, the filled-in ends only contain methylated cytosines resistant to bisulfite conversion. In the second case, oligo 1 contains methylated cytosine, but no guanine. Thus, oligo 1 is resistant to bisulfite conversion. Extension of the 3′ ends of the molecules occurs in the presence of dGTP, dATP, and dTTP, but not dCTP. Thus, the filled-in ends do not contain any cytosines and they are not affected by bisulfite conversion. In both cases, a primer complementary to the adaptor sequence can be utilized without concern for the effects of bisulfite conversion.
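Why both adaptor designs keep their sequence can be illustrated with a minimal in-silico model of bisulfite treatment, in which every unmethylated cytosine is read as thymine after amplification and methylated cytosines are left unchanged. This is a sketch under that idealized assumption (complete conversion, no DNA damage); the sequences are invented for illustration and are not the oligos of FIG. 50.

```python
def bisulfite_convert(seq: str, methylated=frozenset()) -> str:
    """Model bisulfite conversion of one strand.

    Unmethylated C is deaminated to U and read as T after PCR; positions
    listed in `methylated` (0-based) are 5-methylcytosine and stay as C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


# Case 1: hypothetical adaptor strand without any cytosine.
no_c_adaptor = "GATTGAGGTAGTAGG"
print(bisulfite_convert(no_c_adaptor) == no_c_adaptor)

# Case 2: hypothetical adaptor strand whose cytosines are all methylated.
mc_adaptor = "CATTCAACTA"
all_c = frozenset(i for i, b in enumerate(mc_adaptor) if b == "C")
print(bisulfite_convert(mc_adaptor, all_c) == mc_adaptor)

# Hypothetical genomic insert: only the methylated CpG at position 1 keeps its C.
print(bisulfite_convert("ACGTCCGA", frozenset({1})))
```

Because the adaptor strands are unchanged in both cases, a primer designed against the original adaptor sequence still anneals after conversion, while the genomic insert between the adaptors changes according to its methylation state.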

FIG. 51 depicts a comparison of the results of amplification of DNA wherein a single methylated site is not cleaved by a methylation-sensitive restriction endonuclease versus a single unmethylated site that is cleaved by a methylation-sensitive restriction endonuclease. In this example, the methylated site is amplified in the WGA libraries and can be detected by methods sensitive to the presence of both the site of interest and the surrounding sequences. In contrast, the non-methylated site is not incorporated into the library preparation or amplification steps and is not detectable by methods sensitive to the presence of the site of interest. Furthermore, the nature of the library synthesis reaction will produce, in the majority of instances, the exclusion of the sequences surrounding the site of interest. This gap is due to the random nature of the priming reaction used during library synthesis and the statistical improbability of priming directly adjacent to the site of cleavage.

FIG. 52 illustrates the effect of pre-heating of genomic DNA on the efficiency of cleavage by the Aci I restriction enzyme. Preheating at 85° C. results in improved efficiency of cleavage.

FIG. 53 shows exemplary amplification of completely methylated, partially methylated, and non-methylated promoter sites in the KG1-A cell line for the human TIG-1, MGMT, and BRCA-1 genes, respectively.

FIG. 54A shows analysis of DNA samples isolated from serum and urine by gel electrophoresis on 1.5% agarose. A typical banding pattern characteristic of apoptotic nucleosomal DNA is observed.

FIG. 54B shows analysis of DNA from libraries prepared from urine by gel electrophoresis on 1.5% agarose.

FIGS. 55-56 show typical amplification curves of promoter sites for genes implicated in cancer from libraries derived from serum and urine DNA, respectively, for cancer patients and normal healthy controls.

FIG. 57 shows a comparison between libraries prepared with the single tube method and those prepared with a two-step protocol. Digested samples from the single tube protocol had a greatly reduced background as compared to the two-step protocol. This results in significant improvement of the dynamic range and the throughput of the assay.

FIG. 58 shows a titration of the amount of methylated DNA in the background of bulk non-methylated DNA. As little as 0.01% of methylated DNA can be reliably detected in the background of 99.99% of non-methylated DNA.

FIG. 59 shows a comparison between Klenow fragment of DNA polymerase I and T4 DNA polymerase for their ability to preserve the methylation signature of CpG islands during preparation of libraries for methylation analysis. When artificially methylated urine DNA was treated with Klenow fragment of DNA polymerase I prior to restriction cleavage, a delay in threshold cycle (Ct) of 2 to 3 cycles was observed in the resulting libraries, suggesting that a significant fraction (estimated 75% to 90%) of methyl-cytosine containing fragments is lost during the Klenow enzymatic repair process. In contrast, when T4 polymerase was used for repair, the Ct shift was only one cycle or less depending on the site analyzed. This suggests that 50% or more of the methyl-cytosine was preserved when T4 DNA polymerase was used.
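The percentages quoted for FIG. 59 follow from the usual relation between a Ct delay and template abundance. A minimal sketch, assuming ideal amplification in which template doubles every cycle (the `efficiency` parameter and function names are illustrative, not taken from the text):

```python
def fraction_remaining(delta_ct: float, efficiency: float = 1.0) -> float:
    """Template fraction remaining for a Ct delay of delta_ct cycles.

    Assumes each cycle multiplies product by (1 + efficiency);
    efficiency = 1.0 corresponds to perfect doubling.
    """
    return (1.0 + efficiency) ** (-delta_ct)


def fraction_lost(delta_ct: float, efficiency: float = 1.0) -> float:
    """Template fraction lost for a Ct delay of delta_ct cycles."""
    return 1.0 - fraction_remaining(delta_ct, efficiency)


for delta in (1, 2, 3):
    print(delta, fraction_remaining(delta), fraction_lost(delta))
```

Under perfect doubling, delays of 2 and 3 cycles correspond to losses of 75% and 87.5%, close to the 75% to 90% estimate above, and a delay of one cycle corresponds to 50% of the template being preserved.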

FIG. 60A shows real-time PCR amplification curves for a range of input DNA from libraries of bisulfite converted and non-converted DNA.

FIG. 60B shows real-time PCR curves from DNA chemically converted by sodium bisulfite and non-converted DNA using primers that are specific for converted DNA and do not contain CpG dinucleotides in their sequence.

FIGS. 61A-61C illustrate the complex effects of pre-heating Alu I restriction fragments to various temperatures, prior to preparation of methylome libraries by ligation of universal adaptor, on the relative presence of promoter sequences. Promoter sequences of high, intermediate, or low GC content are analyzed by quantitative PCR as exemplified by the GSTP-1 (FIG. 61A), MDR-1 (FIG. 61B), and APC (FIG. 61C) promoters, respectively. Differential enrichment of library fragments based on their GC content is demonstrated.

FIGS. 62A-62B show that methylome libraries prepared from cell-free urine DNA by ligation of universal adaptor can be enriched for promoter sequences by pre-heating prior to library preparation at temperatures that will selectively denature the fraction of DNA having low average GC content, making it incompetent for ligation. Maximal enrichment of promoter sites is achieved by pre-heating at 89° C. to 91° C.

FIG. 63 shows PCR amplification curves of specific promoter sites from amplified libraries prepared from cell-free urine DNA by ligation of a degradable hairpin adaptor containing deoxy-uridine with or without subsequent cleavage with methylation-sensitive restriction enzymes. Promoter sites from non-methylated cleaved DNA amplify with significant (at least 10 cycles) delay as compared to uncut DNA for all four promoter sites tested. Methylated DNA is refractory to cleavage.

FIG. 64 shows PCR amplification curves of specific promoter sites from amplified libraries prepared from cell-free urine DNA by ligation of a degradable hairpin adaptor containing deoxy-uridine with or without simultaneous cleavage with methylation-sensitive restriction enzymes. Promoter sites from non-methylated cleaved DNA amplify with significant (at least 10 cycles) delay as compared to uncut DNA, unlike methylated DNA, which is refractory to cleavage.

FIG. 65 shows the threshold cycle (Ct) difference between cut and uncut mixtures of LNCaP prostate cancer DNA and normal non-methylated DNA calculated from real time PCR curves for three primer pairs amplifying promoter sites in methylome libraries prepared by incorporation of universal sequence by self-inert primers. Detection sensitivity of at least 99% is evident.

FIG. 66 shows PCR amplification curves of four promoter sites from secondary methylome libraries prepared from LNCaP prostate cancer cell line compared to control fragmented genomic DNA. Methylated promoters are enriched between 16-fold and 128-fold relative to non-amplified genomic DNA, whereas no amplification is detected for the non-methylated p16 promoter (the amplification curve from the methylome library for this promoter corresponds to a false product).

FIG. 67 shows amplification curves for two promoter sites from crude cell-free urine DNA as compared to non-amplified methylome library prepared from cell-free urine DNA with or without cleavage with methylation-sensitive restriction enzymes. Significant improvement of both PCR amplifiability and cleavage with restriction enzymes is observed after enzymatic processing of urine DNA during one-step methylome library preparation.

FIGS. 68A-68E show a diagram illustrating different methods of preparing Methylome libraries that do not involve bisulfite conversion. FIG. 68A shows DNA cleavage with multiple methylation-sensitive restriction enzymes followed by the library synthesis, amplification and analysis. FIG. 68B shows DNA cleavage with multiple methylation-sensitive restriction enzymes occurring after library synthesis, and then followed by amplification and analysis; FIG. 68C illustrates the possibility of utilizing an alternative whole genome amplification method, specifically, the multiple strand displacement WGA technique that does not require a library synthesis step, such that DNA can be amplified directly after the cleavage with multiple methylation-sensitive restriction enzymes; FIG. 68D shows preparation of the Methylome library using a single-step multiplex enzymatic approach that utilizes a hairpin oligonucleotide with a special base composition and a mixture of 9 enzymes (see FIG. 73 and FIG. 74); FIG. 68E describes an envisioned process that combines the Methylome library preparation (as described in FIG. 68D) and isothermal amplification (for example, by transcription using T7 RNA polymerase, assuming that the hairpin oligonucleotide contains a T7 promoter sequence that becomes attached to DNA ends during the reaction) into a single-reaction multiplex enzymatic process.

FIGS. 69A-69D show a diagram illustrating different methods of preparing a Methylome library that involve bisulfite conversion. In FIG. 69A, DNA cleavage with multiple methylation-sensitive restriction enzymes is followed by the library synthesis, bisulfite conversion, amplification and analysis. FIG. 69B shows that DNA cleavage with multiple methylation-sensitive restriction enzymes occurs after the library synthesis, and is then followed by bisulfite conversion, amplification and analysis; FIG. 69B′ shows WGA library synthesis using ligation and adaptors directly followed by bisulfite conversion, amplification and analysis. In FIG. 69C, Methylome library synthesis occurs in one step using a hairpin oligo-adaptor and a mix of all enzymes involved in the library synthesis, and is then followed by bisulfite conversion, amplification and analysis. In FIG. 69D, DNA bisulfite conversion is followed by degenerate primer-mediated whole genome amplification (DP-WGA).

FIG. 70A illustrates a principle of the Methylome librarythermo-enrichment method that utilizes a heating-selection step afterthe DNA end “polishing” step but prior to the adaptor ligation reaction.Only GC-rich DNA fragments would retain double stranded structure uponheating and remain competent for ligation to the blunt end adaptor.

FIG. 70B illustrates a principle of the Methylome librarythermo-enrichment method that utilizes a heating-selection step afterthe adaptor ligation step (when only 5′ DNA ends become covalentlyattached to the adaptor) but prior to “fill-in” polymerization step thatcompletes formation of the Methylome library amplicons. Only GC-rich DNAfragments would retain double stranded structure upon heating and remaincompetent for the “fill-in” polymerization reaction.

FIGS. 71A-71B. FIG. 71A shows a base composition distribution of thehuman genome with a peak at 42% GC. FIG. 71B shows expected kinetics ofstrand dissociation for double-stranded DNA molecules with differentbase composition at pre-melting conditions.

FIGS. 72A-72D illustrate and compare several envisioned versions of the thermo-enrichment Methylome library method including those depicted on FIGS. 70A-70B. In FIG. 72A, blunt-end DNA after restriction enzyme cleavage is heated, and the GC-rich DNA fraction is selected by the ligation process. In FIG. 72B, degraded DNA is “polished” by proofreading DNA polymerase, heated and the GC-rich DNA fraction is selected by the ligation process. In FIG. 72C, degraded DNA is “polished” by proofreading DNA polymerase, ligated by its 5′ end to the adaptor, heated and the GC-rich DNA fraction is selected by the “fill-in” synthesis process. In FIG. 72D, degraded DNA is converted into a Methylome library, amplified using a primer with a 5′ phosphate group, heated and the GC-rich DNA fraction is selected by the ligation of new adaptor(s) and re-amplification.

FIG. 73 illustrates the principle of the one-step Methylome librarysynthesis method that involves a hairpin oligonucleotide adaptor andprovides the exemplary reactions including end polishing, hairpinoligonucleotide processing, oligonucleotide ligation, “fill-in” DNA endsynthesis, and cleavage with multiple methylation-sensitive restrictionenzymes to occur simultaneously in one complex reaction mix.

FIG. 74 shows the ligation and structural modification of the dU-HairpinOligonucleotide adaptor during a “One-step” methylome synthesisreaction. For simplicity a single DNA fragment end is shown acceptingthe adaptor through blunt-end ligation of the 3′ end of the adaptor tothe 5′ end of the DNA fragment. Simultaneously, dUTP bases are cleavedby Uracil DNA glycosylase to abasic sites. Once ligated the 3′ end ofthe DNA fragment is extended by DNA polymerase activity displacing thehairpin sequence and extending into the hairpin adaptor up to the firsttemplate abasic site. In addition to showing a portion of SEQ ID NO: 206(TGTGTTGGGUGUGTGTG), FIG. 74 also shows the same sequence replaced withT residues in place of the U residues (TGTGTTGGGTGTGTGTG; SEQ ID NO:211).

DETAILED DESCRIPTION OF THE INVENTION

In keeping with long-standing patent law convention, the words “a” and“an” when used in the present specification in concert with the wordcomprising, including the claims, denote “one or more.”

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, recombinant DNA, and so forth which are within the skill of the art. Such techniques are explained fully in the literature. See e.g., Sambrook, Fritsch, and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, Second Edition (1989), OLIGONUCLEOTIDE SYNTHESIS (M. J. Gait Ed., 1984), ANIMAL CELL CULTURE (R. I. Freshney, Ed., 1987), the series METHODS IN ENZYMOLOGY (Academic Press, Inc.); GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (J. M. Miller and M. P. Calos eds. 1987), HANDBOOK OF EXPERIMENTAL IMMUNOLOGY, (D. M. Weir and C. C. Blackwell, Eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Siedman, J. A. Smith, and K. Struhl, eds., 1987), CURRENT PROTOCOLS IN IMMUNOLOGY (J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach and W. Strober, eds., 1991); ANNUAL REVIEW OF IMMUNOLOGY; as well as monographs in journals such as ADVANCES IN IMMUNOLOGY. All patents, patent applications, and publications mentioned herein, both supra and infra, are hereby incorporated herein by reference.

The present application is related to the subject matter of U.S. patentapplication Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No.7,655,791; U.S. patent application Ser. No. 10/797,333, filed Mar. 8,2004, published as U.S. Patent Application Publication No.: 2004/0209299and is now abandoned; U.S. patent application Ser. No. 10/795,667, filedMar. 8, 2004, now U.S. Pat. No. 7,718,403 all of which are incorporatedby reference herein in their entirety.

I. DEFINITIONS

The term “attachable ends” as used herein refers to DNA ends that arepreferably blunt ends or comprise short overhangs on the order of about1 to about 3 nucleotides, in which an adaptor is able to be attachedthereto. A skilled artisan recognizes that the term “attachable ends”comprises ends that are ligatable, such as with ligase, or that are ableto have an adaptor attached by non-ligase means, such as by chemicalattachment.

The term “base analog” as used herein refers to a compound similar to one of the nitrogenous bases found in nucleic acids (adenine, cytosine, guanine, thymine, and uracil) but having a different composition and, as a result, different pairing properties. For example, 5-bromouracil is an analog of thymine but sometimes pairs with guanine, and 2-aminopurine is an analog of adenine but sometimes pairs with cytosine. Another analog, nitroindole, is used as a “universal” base that pairs with all other bases.

The term “backbone analog” as used herein refers to a compound whereinthe deoxyribose phosphate backbone of DNA has been modified. Themodifications can be made in a number of ways to change nucleasestability or cell membrane permeability of the modified DNA. Forexample, peptide nucleic acid (PNA) is a new DNA derivative with anamide backbone instead of a deoxyribose phosphate backbone. Otherexamples in the art include methylphosphonates, for example.

The term “bisulfite-converted DNA” as used herein refers to DNA that hasbeen subjected to sodium bisulfite such that at least some of theunmethylated cytosines in the DNA are converted to uracil.

The term “blocked 3′ end” as used herein is defined as a 3′ end of DNAlacking a hydroxyl group.

The term “blunt end” as used herein refers to the end of a dsDNAmolecule having 5′ and 3′ ends, wherein the 5′ and 3′ ends terminate atthe same nucleotide position. Thus, the blunt end comprises no 5′ or 3′overhang.

The term “polished” as used herein refers to the repair of dsDNAfragment termini which may be enzymatically repaired, wherein the repairconstitutes the fill in of recessed 3′ ends or the exonuclease activitytrimming back of 5′ ends to form a “blunt end” compatible with adapterligation.

The term “CpG island” as used herein is defined as an area of DNA that is enriched in CG dinucleotide sequences (cytosine and guanine nucleotide bases) compared to the average distribution within the genome. The generally accepted CpG island constitutes a region of at least 200 bp of DNA with a G+C content of at least 50% and an observed CpG/expected CpG ratio of at least 0.6.
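The three criteria in this definition can be checked with a short script. The following is a minimal sketch only: the definition does not state how the expected CpG count is computed, so the commonly used estimate (number of C × number of G) / sequence length is assumed here, and the function names are illustrative.

```python
def cpg_island_stats(seq):
    """Return (length, G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    observed_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed estimate of the expected CpG count: (#C * #G) / length
    expected_cpg = (c * g) / n if n else 0.0
    ratio = observed_cpg / expected_cpg if expected_cpg else 0.0
    return n, gc_fraction, ratio


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three thresholds given in the definition above."""
    n, gc_fraction, ratio = cpg_island_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and ratio >= min_ratio
```

For example, a 200-base run of alternating C and G passes all three thresholds, whereas a 200-base run of alternating A and T fails the G+C criterion.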

The term “DNA immortalization” as used herein is defined as theconversion of a mixture of DNA molecules into a form that allowsrepetitive, unlimited amplification without loss of representationand/or without size reduction. In a specific embodiment, the mixture ofDNA molecules is comprised of multiple DNA sequences.

The term “fill-in reaction” as used herein refers to a DNA synthesisreaction that is initiated at a 3′ hydroxyl DNA end and leads to afilling in of the complementary strand. The synthesis reaction comprisesat least one polymerase and dNTPs (dATP, dGTP, dCTP and dTTP). In aspecific embodiment, the reaction comprises a thermostable DNApolymerase.

The term “genome” as used herein is defined as the collective gene setcarried by an individual, cell, or organelle.

The term “hairpin” as used herein refers to a structure formed by anoligonucleotide comprised of 5′ and 3′ terminal regions that areinverted repeats and a non-self-complementary central region, whereinthe self-complementary inverted repeats form a double-stranded stem andthe non-self-complementary central region forms a single-stranded loop.

The term “methylation-sensitive restriction endonuclease” as used hereinrefers to a restriction endonuclease that is unable to cut DNA that hasat least one methylated cytosine present in the recognition site. Askilled artisan recognizes that the term “restriction endonuclease” maybe used interchangeably in the art with the term “restriction enzyme.”

The term “methylation-specific restriction endonuclease” as used hereinregards an enzyme that cleaves DNA comprising at least onemethylcytosine on at least one strand. In a specific embodiment, theMcrBC enzyme is utilized and will not cleave unmethylated DNA. A skilledartisan recognizes that the term “restriction endonuclease” may be usedinterchangeably in the art with the term “restriction enzyme.”

The term “Methylome” as used herein is defined as the collective set ofgenomic fragments comprising methylated cytosines, or alternatively, aset of genomic fragments that comprise methylated cytosines in theoriginal template DNA.

The term “non-replicable organic chain” as used herein is defined as any link between bases that cannot be used as a template for polymerization, and, in specific embodiments, arrests a polymerization/extension process.

The term “non-replicable region” as used herein is defined as any region of an oligonucleotide that cannot be used as a template for polymerization, and, in specific embodiments, arrests a polymerization/extension process.

The term “non strand-displacing polymerase” as used herein is defined asa polymerase that extends until it is stopped by the presence of, forexample, a downstream primer. In a specific embodiment, the polymeraselacks 5′-3′ exonuclease activity.

The term “promoter” as used herein refers to a sequence that regulatesthe transcription of a particular nucleic acid sequence, which may bereferred to as a polynucleotide that encodes a gene product.

The term “random fragmentation” as used herein refers to thefragmentation of a DNA molecule in a non-ordered fashion, such asirrespective of the sequence identity or position of the nucleotidecomprising and/or surrounding the break.

The term “random primers” as used herein refers to shortoligonucleotides used to prime polymerization comprised of nucleotides,at least the majority of which can be any nucleotide, such as A, C, G,or T.

The term “replication stop” as used herein is defined as any region of an oligonucleotide (which may be comprised as or in an adaptor) that cannot be used as a template for polymerization, and, in specific embodiments, arrests a polymerization/extension process.

The term “strand-displacing polymerase” as used herein is defined as apolymerase that will displace downstream fragments as it extends. In aspecific embodiment, the polymerase comprises 5′-3′ exonucleaseactivity.

The term “thermophilic DNA polymerase”, as used herein refers to aheat-stable DNA polymerase.

A skilled artisan recognizes that there is a conventional single lettercode in the art to represent a selection of nucleotides for a particularnucleotide site. For example, R refers to A or G; Y refers to C or T; Mrefers to A or C; K refers to G or T; S refers to C or G; W refers to Aor T; H refers to A or C or T; B refers to C or G or T; V refers to A orC or G; D refers to A or G or T; and N refers to A or C or G or T. Thus,a YN primer comprises at least one, and preferably more, series ofdinucleotide sets each comprising a C or a T at the first position andan A, C, G, or T at the second position. These dinucleotide sets may berepeated in the primer (or adaptor).
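The single-letter code listed above can be written as a lookup table. The sketch below is illustrative only (the function names are not part of the specification); it expands a degenerate sequence into the concrete sequences it represents and counts them.

```python
from itertools import product

# Single-letter degenerate nucleotide code, as listed in the paragraph above
DEGENERATE_CODE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def expand(seq):
    """List every concrete sequence represented by a degenerate sequence."""
    choices = [DEGENERATE_CODE[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(seq):
    """Number of concrete sequences represented, without enumerating them."""
    total = 1
    for base in seq.upper():
        total *= len(DEGENERATE_CODE[base])
    return total
```

A single YN dinucleotide set expands to eight dinucleotides (CA, CC, CG, CT, TA, TC, TG, TT), and a (Y)10(N)2 region represents 2^10 × 4^2 = 16,384 sequences, which is why `degeneracy` is preferable to `expand` for long degenerate regions.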

II. THE PRESENT INVENTION

A. Amplification of Sodium Bisulfite Converted DNA by Incorporation ofUniversal Known Sequence with Self-Inert Degenerate Primers

In embodiments of the present invention, there is whole genome amplification of DNA comprising incorporation of known sequence followed by a subsequent PCR amplification step using the known sequence. In a specific embodiment, the primers for incorporating the known sequence comprise a degenerate region, and in further specific embodiments, the known sequence and the degenerate region comprise a non-self-complementary nucleic acid sequence. Thus, there is significant reduction in self-hybridization and intermolecular primer hybridization compared to primers containing self-complementary sequences. For amplification of sodium bisulfite-converted DNA, a degenerate primer is mixed with a primer comprising the same known sequence as the degenerate primer but having a homo-polymeric region instead of a degenerate region, herein referred to as a “facilitating primer”, and in further specific embodiments, the known sequence and the homo-polymeric region comprise a non-self-complementary nucleic acid sequence. Since sodium bisulfite-converted DNA has a modified base composition and is enriched in adenine and uracil, the homopolymeric region of the primer selectively targeting converted DNA strands comprises either T or A.

Formation of primer dimers is a common problem in existing methods forDNA amplification using random primers. Due to the high complexity ofthe random primers, and in order to achieve efficient priming for eachindividual sequence, they have to be applied at very highconcentrations. Thus, the efficiency of annealing to a target DNAtemplate is greatly reduced due to the formation of primer-dimers.

Other problems known in the art when using random primers to amplify DNAare an inability to amplify the genome in its entirety due to locusdropout (loss), generation of short amplification products, and in somecases, the inability to amplify degraded or artificially fragmented DNA.

In specific embodiments, the invention utilizes an oligonucleotide primer comprising, at least as the majority of its sequence, only two types of nucleotide bases that are not able to participate in stable Watson-Crick pairing with each other, and thus cannot self-prime (see U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403, for example). The primers comprise a constant known sequence at their 5′ end and a degenerate nucleotide sequence located 3′ to the constant known sequence. There are four possible two-base combinations known not to participate in Watson-Crick base pairing: C-T, G-A, A-C and G-T. They suggest four different types of degenerate primers that should not form a single Watson-Crick base pair and should not create primer-dimers in the presence of a DNA polymerase and dNTPs. These primers are illustrated in FIG. 2 and are referred to as primers Y, R, M and K, respectively, in accordance with common nomenclature for degenerate nucleotides: Y = C or T, R = G or A, M = A or C and K = G or T.

For example, Y-primers have a 5′ known sequence YU comprised of C and T bases and a degenerate region (Y)10 at the 3′ end comprising ten, for example, randomly selected pyrimidine bases C and T. R-primers have a 5′ known sequence RU comprised of G and A bases and a degenerate region (R)10 at the 3′ end comprising ten, for example, randomly selected purine bases G and A. M-primers have a 5′ known sequence MU comprised of A and C bases and a degenerate region (M)10 at the 3′ end comprising ten, for example, randomly selected bases A and C. Finally, K-primers have a 5′ known sequence KU comprised of G and T bases and a degenerate region (K)10 at the 3′ end comprising ten, for example, randomly selected bases G and T. Primers of the described design will not self-prime and thus will not form primer dimers. However, they will prime at target sites containing the corresponding Watson-Crick base partners, albeit with reduced overall frequency compared to completely random primers. In specific embodiments, these primers are capable of forming primer dimers under specific conditions but at a greatly reduced level compared to primers lacking such structure.
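The self-inert property can be illustrated computationally. The sketch below is a simplification under stated assumptions: it considers only strict A-T and C-G pairing (G-T wobble pairing is ignored), it accepts only the plain bases A, C, G and T, and the function names are illustrative. It derives the four non-pairing two-base alphabets and tests whether a primer contains any base together with its Watson-Crick partner.

```python
from itertools import combinations

# Strict Watson-Crick partners only; wobble pairing is deliberately ignored
WC_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def non_pairing_alphabets():
    """Two-base combinations whose members cannot Watson-Crick pair with each other."""
    return [a + b for a, b in combinations("ACGT", 2) if WC_PARTNER[a] != b]


def is_self_inert(primer):
    """True if no base in the primer has its Watson-Crick partner anywhere in the primer."""
    bases = set(primer.upper())
    return all(WC_PARTNER[base] not in bases for base in bases)
```

`non_pairing_alphabets()` returns `["AC", "AG", "CT", "GT"]`, which correspond to the M, R, Y and K primer types. Under this test the constant sequence CCTTTCTCTCCCTTCTCT (C and T only) is self-inert, whereas the T7 sequence GTAATACGACTCACTATAGG, which contains all four bases, is not.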

Facilitating primers, selectively targeting bisulfite-converted DNA, comprise a 5′ known sequence RU, comprised of G and A bases, or YU, comprised of C and T bases, and a homopolymeric sequence comprised of A or T, respectively. These primers are combined at different ratios with their respective degenerate counterparts. For example, a primer with a known sequence of RU and a homopolymeric region comprised of A is combined with a degenerate primer with a known sequence of RU and a degenerate sequence of G and A. Similarly, a primer with a known sequence of YU and a homopolymeric region comprised of T is combined with a degenerate primer with a known sequence of YU and a degenerate sequence of C and T.

In some embodiments, these primers are supplemented with a completelyrandom (i.e., comprising any of the four bases) short nucleotidesequence at their 3′ end. If a limited number of completely random basesare present at the 3′ end of the Y, R, M or K primers, that willincrease their priming frequency yet maintain limited ability forself-priming. By using a different number of completely random bases atthe 3′ end of the degenerate Y, R, M or K primers, and by carefullyoptimizing the reaction conditions, one can precisely control theoutcome of the polymerization reaction in favor of the desired DNAproduct with minimum primer-dimer formation.

Thus, in the first step referred to as “library synthesis,” primers ofthe described design are randomly incorporated in anextension/polymerization reaction with a DNA polymerase possessingstrand-displacement activity. The resulting branching process createsDNA molecules having known (universal) self complementary sequences atboth ends. In a second step referred to as the “amplification” step,these molecules are amplified exponentially by polymerase chain reactionusing Taq DNA polymerase and a single primer corresponding to the known5′-tail of the random primers. FIG. 1 presents a schematic outline ofthe method of the invention. The invention overcomes major problemsknown in the art for DNA amplification by random primers.

1. Sources of DNA

DNA of any source or complexity, or fragments thereof, can be amplified by the method described in the invention before or after conversion with bisulfite. In specific embodiments, dsDNA is denatured with heat, chemical treatment (such as alkaline pH), mechanical manipulation, or a combination thereof to generate ssDNA, wherein the ssDNA is subjected to the methods described herein. Single-stranded DNA prepared by alkaline denaturation is treated with sodium bisulfite to chemically convert substantially all unmethylated cytosine bases to uracil using established protocols well known in the art (Frommer et al., 1992; Grunau et al., 2001). Methylated cytosines are resistant to this chemical reaction and thus are not converted to uracil, as illustrated in FIG. 3 and FIG. 4.
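The effect of bisulfite treatment on a single strand can be modeled in a few lines. The sketch below is an idealization that assumes complete conversion of every unmethylated cytosine, whereas the text above speaks of “substantially all”; the function name and the way methylated positions are passed in are illustrative.

```python
def bisulfite_convert(strand, methylated_positions=()):
    """Idealized bisulfite conversion of one DNA strand.

    Every cytosine becomes uracil unless its 0-based index is listed in
    methylated_positions; all other bases are left unchanged.
    """
    protected = set(methylated_positions)
    converted = []
    for index, base in enumerate(strand.upper()):
        if base == "C" and index not in protected:
            converted.append("U")  # unmethylated cytosine is deaminated to uracil
        else:
            converted.append(base)  # methylated cytosine and other bases persist
    return "".join(converted)
```

With the strand ACGTCCGA and a methylated cytosine at index 1, the result is ACGTUUGA: the methylated CpG cytosine is retained while the two unmethylated cytosines are read as uracil.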

2. Design of Degenerate Primers

FIG. 2 illustrates the design of self-inert degenerate primers utilized in this aspect of the invention. In principle, the oligonucleotide primers comprise a constant known sequence at their 5′ end and a degenerate nucleotide sequence 3′ to it, each comprised of any of at least four possible base combinations known not to participate in Watson-Crick base pairing. The possible primer compositions include pyrimidines only (C and T), purines only (A and G), or non-pairing purines and pyrimidines (A and C or G and T). The last combination (G and T) is known in the art to permit non-canonical Watson-Crick base-pairing. In a preferred embodiment, the G and T pair is utilized in the invention. In a specific embodiment, the primers comprise a constant part of about 18 bases comprised of C and T, G and A, A and C, or G and T bases at the 5′ end, followed by about 10 random Y, R, M or K bases, respectively, and between 0 and about 6 completely random bases N at the 3′ end (FIG. 2, SEQ ID NOs: 1-7). Examples 1 and 2 show that Y and YN primers form only a limited amount of primer-dimers, and this is proportional to the number of completely random bases (N) at their 3′ termini. In contrast, a primer of similar design but comprised of bases that can participate in Watson-Crick base-pairing generates an excessive amount of primer-dimers, which greatly reduces the efficiency of DNA amplification.

The choice of primers will depend on the base composition, complexity,and the presence and abundance of repetitive elements in the target DNA.By combining the products of individual amplification reactions withdegenerate primers comprising different non-Watson-Crick pairs, buthaving the same known sequence at the ends, one can achieve the highestpossible level of representative and uniform DNA amplification. Askilled artisan recognizes how to select the optimal primers andreaction conditions to achieve the desired result.

3. Design of Primers Targeting Sodium Bisulfite-Converted DNA

To specifically target DNA strands with chemically changed basecomposition after bisulfite conversion, self-inert degenerate primerscomprised of A and G or C and T bases are mixed with primers having thesame constant 18 base sequence at the 5′ end, followed by about 10 basesof homo-polymeric sequence comprised of A or T, respectively, andbetween 0 and about 6 completely random bases (N) at the 3′ end (FIG. 1and SEQ ID NO: 18 and SEQ ID NO: 19), herein referred to as“facilitating primers”. Thus, the primer composition is specificallyenriched for bases that will target converted DNA strands with reducedG/C and increased A/T content.
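The primer layout described above (constant 5′ sequence, a 10-base middle region, and a short random 3′ tail) can be written out as string assembly. This is a sketch only: the constant sequences are the Y_(U) and R_(U) entries of Table I, the helper names are illustrative, and degenerate positions are written as the letters Y, R and N rather than enumerated.

```python
Y_U = "CCTTTCTCTCCCTTCTCT"  # constant C/T sequence (SEQ ID NO: 8 in Table I)
R_U = "AGAGAAGGGAGAGAAAGG"  # constant G/A sequence (SEQ ID NO: 11 in Table I)


def degenerate_primer(constant, code, n_random=2, region_len=10):
    """Constant 5' part + degenerate region (one code letter repeated) + N tail."""
    return constant + code * region_len + "N" * n_random


def facilitating_primer(constant, homopolymer_base, n_random=2, region_len=10):
    """Same layout, with a homopolymeric A or T region replacing the degenerate one."""
    if homopolymer_base not in ("A", "T"):
        raise ValueError("homopolymeric region is comprised of A or T")
    if homopolymer_base not in constant:
        raise ValueError("pair A with the G/A constant part and T with the C/T constant part")
    return constant + homopolymer_base * region_len + "N" * n_random
```

`degenerate_primer(Y_U, "Y")` gives the 30-base Y(N)₂ layout and `facilitating_primer(Y_U, "T")` gives the Y_(U)(T)₁₀(N)₂ layout; the second check enforces the pairing of A with R_(U) and T with Y_(U) described in the text.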

4. Choice of DNA Polymerases

In a preferred embodiment, a DNA polymerase is utilized that possesses strand-displacement activity. Preferred strand-displacement DNA polymerases are as follows: Klenow fragment of E. coli DNA polymerase I; exo⁻ DNA polymerases of the T7 family, i.e., polymerases that require host thioredoxin subunit as co-factor, such as: T7, T3, φI, φII, W31, H, Y, gh-1, SP6, or A1122 (Studier, 1979); exo⁻ Bst large fragment; Bca DNA polymerase; 9°Nm polymerase; M-MuLV Reverse Transcriptase; phage φ29 polymerase; phage M2 polymerase; phage φPRD1 polymerase; exo⁻ VENT polymerase; and phage T5 exo⁻ DNA polymerase. Klenow fragment of DNA polymerase I and phage T7 DNA polymerase with reduced or eliminated 3′-5′ exonuclease activities are most preferred in the present invention. Thus, in a preferred embodiment the Klenow fragment of DNA polymerase I or Sequenase version 2 is used as the polymerase (Example 2).

5. Reaction Conditions

In general, factors increasing priming efficiency, such as reducedtemperature or elevated salt and/or Mg²⁺ ion concentration, inhibit thestrand-displacement activity and the nucleotide incorporation rate ofDNA polymerases, and elevated temperatures and low Mg²⁺ ion or saltconcentrations increase the efficiency ofpolymerization/strand-displacement but reduce the priming efficiency. Onthe other hand, factors promoting efficient priming also increase thechances of primer-dimer formation. Strand-displacement activity can befacilitated by several protein factors. Any polymerase that can performstrand-displacement replication in the presence or in the absence ofsuch strand-displacement or processivity enhancing factors is suitablefor use in the disclosed invention, even if the polymerase does notperform strand-displacement replication in the absence of such factors.Factors useful in strand-displacement replication are (i) any of anumber of single-stranded DNA binding proteins (SSB proteins) ofbacterial, viral, or eukaryotic origin, such as SSB protein of E. coli,phage T4 gene 32 product, phage T7 gene 2.5 protein, phage Pf3 SSB,replication protein A RPA32 and RPA14 subunits (Wold, 1997); (ii) otherDNA binding proteins, such as adenovirus DNA-binding protein, herpessimplex protein ICP8, BMRF1 polymerase accessory subunit, herpes virusUL29 SSB-like protein; (iii) any of a number of replication complexproteins known to participate in DNA replication such as phage T7helicase/primase, phage T4 gene 41 helicase, E. coli Rep helicase, E.coli recBCD helicase, E. coli and eukaryotic topoisomerases (Champoux,2001).

The exact parameters of the polymerization reaction will depend on thechoice of polymerase and degenerate primers, and a skilled artisanrecognizes, based on the teachings provided herein, how to modify suchparameters. By varying the number of random bases at the 3′ end of thedegenerate primers and by carefully optimizing the reaction conditions,formation of primer-dimers can be kept to a minimum, while at the sametime the amplification efficiency and representation can be maximized.

Random fragmentation of DNA can be performed by mechanical, chemical, orenzymatic treatment. In a preferred embodiment, DNA is fragmented byheating at about 95° C. in low salt buffers such as TE (10 mM Tris-HCl,1 mM EDTA, having pH between 7.5 and 8.5) or TE-L (10 mM Tris-HCl, 0.1mM EDTA, having pH between 7.5 and 8.5) for between about 1 and about 10minutes (for example, see U.S. patent application Ser. No. 10/293,048,filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791, incorporated byreference herein in its entirety).

A typical library synthesis reaction of the present invention is performed in a reaction mixture having a volume ranging between about 10 and about 25 μl. The reaction mixture preferably comprises about 0.5 to about 100 ng of thermally or mechanically fragmented DNA, or in particular embodiments less than about 0.5 ng DNA, about 0.5 to about 30 μM of degenerate primer, about 0 to about 200 nM of known sequence primer (i.e., primer corresponding to the known 5′ end of the respective degenerate primer), between about 2 and about 10 units of Klenow Exo⁻ (New England Biolabs) or Sequenase version 2 (USB Corporation), between 0 and about 360 ng SSB protein, between about 5 and about 10 mM MgCl₂, and between 0 and about 100 mM NaCl. The reaction buffer preferably has a buffering capacity that is operational at physiological pH between about 6.5 and about 9. Preferably, the incubation time of the reaction is between about 10 minutes and about 180 minutes, and the incubation temperature is between about 12° C. and about 37° C. Incubation is performed by cycling between about 12° C. and about 37° C. for a total of 3 to 5 min per cycle, or preferably by a single isothermal step between about 12° C. and about 30° C. or sequential isothermal steps between about 12° C. and about 37° C. The reaction is terminated by addition of a sufficient amount of EDTA to chelate Mg²⁺ or, preferably, by heat-inactivation of the polymerase, or both.

In a preferred embodiment of the present invention, the library synthesis reaction is performed in a volume of about 15 μl. The reaction mixture comprises about 5 ng or less of non-converted or sodium bisulfite-converted fragmented DNA. For amplification of non-converted DNA, the mixture comprises about 1 μM of degenerate primer K(N)₂ (SEQ ID NO: 14) containing G and T bases at the known and degenerate regions and 2 completely random 3′ bases. For amplification of bisulfite-converted DNA, the mixture comprises about 0.5 μM degenerate primers Y(N)₂ and R(N)₂ (FIG. 1 and SEQ ID NO: 3 and SEQ ID NO: 10) comprised of A and G or C and T bases at the known and degenerate regions and 2 completely random 3′ bases, and about 0.5 μM facilitating primers R_(U)(A)₁₀(N)₂ and Y_(U)(T)₁₀(N)₂ (FIG. 1 and SEQ ID NO: 18 and SEQ ID NO: 19) having the same constant 18 base sequence at the 5′ end as the respective degenerate primers, followed by 10 bases of homo-polymeric sequence comprised of A or T respectively and 2 completely random bases N at the 3′ end. The mixture further comprises between about 2 units and about 10 units of Klenow Exo⁻ DNA polymerase (NEB), between about 5 mM and about 10 mM MgCl₂, about 100 mM NaCl, about 10 mM Tris-HCl buffer having pH of about 7.5, and about 7.5 mM dithiothreitol. Preferably, the incubation time of the reaction is between about 60 and about 120 minutes and the incubation temperature is about 24° C. in an isothermal mode or, in another preferred embodiment, the incubation is performed by sequential isothermal steps at between about 16° C. and about 37° C.
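As a worked illustration of assembling the 15 μl reaction, the volume of each stock solution follows from the dilution relation C1 × V1 = C2 × V2. The stock concentrations used below (10 μM primer, 50 mM MgCl₂, 1 M NaCl) are hypothetical values chosen for the example and are not taken from the specification; only the final concentrations come from the text above.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock to add, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_VOLUME_UL = 15.0

# name: (final concentration from the text, hypothetical stock concentration)
components = {
    "K(N)2 primer (uM)": (1.0, 10.0),
    "MgCl2 (mM)": (5.0, 50.0),
    "NaCl (mM)": (100.0, 1000.0),
}

volumes = {
    name: stock_volume_ul(final, stock, REACTION_VOLUME_UL)
    for name, (final, stock) in components.items()
}

# Volume left over for DNA, enzyme, remaining buffer components and water
remaining_ul = REACTION_VOLUME_UL - sum(volumes.values())
```

With these assumed stocks each listed component contributes 1.5 μl, leaving 10.5 μl for the remaining reagents.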

A typical amplification step with known sequence primer comprises between about 1 and about 10 ng of library synthesis products and between about 0.3 and about 2 μM of known sequence primer in a standard PCR reaction well known in the art, under conditions optimal for a thermostable DNA polymerase, such as Taq DNA polymerase, Pfu polymerase, or derivatives and mixtures thereof.

TABLE I
OLIGONUCLEOTIDE SEQUENCES

No.  Code               Sequence 5′-3′*
 1.  Y                  CCTTTCTCTCCCTTCTCTYYYYYYYYYY (SEQ ID NO: 1)
 2.  YN                 CCTTTCTCTCCCTTCTCTYYYYYYYYYYN (SEQ ID NO: 2)
 3.  Y(N)₂              CCTTTCTCTCCCTTCTCTYYYYYYYYYYNN (SEQ ID NO: 3)
 4.  Y(N)₃              CCTTTCTCTCCCTTCTCTYYYYYYYYYYNNN (SEQ ID NO: 4)
 5.  Y(N)₄              CCTTTCTCTCCCTTCTCTYYYYYYYYYYNNNN (SEQ ID NO: 5)
 6.  Y(N)₅              CCTTTCTCTCCCTTCTCTYYYYYYYYY (SEQ ID NO: 6)
 7.  Y(N)₆              CCTTTCTCTCCCTTCTCTYYYYYYYYY (SEQ ID NO: 7)
 8.  Y_(U)              CCTTTCTCTCCCTTCTCT (SEQ ID NO: 8)
 9.  Template           GTAATACGACTCACTATAGGRRRRRRRRRR (SEQ ID NO: 9)
10.  R(N)₂              AGAGAAGGGAGAGAAAGGRRRRRRRRRRNN (SEQ ID NO: 10)
11.  R_(U)              AGAGAAGGGAGAGAAAGG (SEQ ID NO: 11)
12.  M(N)₂              CCAAACACACCCAACACAMMMMMMMMMMNN (SEQ ID NO: 12)
13.  M_(U)              CCAAACACACCCAACACA (SEQ ID NO: 13)
14.  K(N)₂              TGTGTTGGGTGTGTTTGG (SEQ ID NO: 14)
15.  K_(U)              TGTGTTGGGTGTGTTTGG (SEQ ID NO: 15)
16.  T7(N)₆             GTAATACGACTCACTATAGGNNNNNN (SEQ ID NO: 16)
17.  T7                 GTAATACGACTCACTATAGG (SEQ ID NO: 17)
18.  R_(U)(A)₁₀(N)₂     AGAGAAGGGAGAGAAAGGAAAAAAAAAANN (SEQ ID NO: 18)
19.  Y_(U)(T)₁₀(N)₂     CCTTTCTCTCCCTTCTCTTTTTTTTTTTNN (SEQ ID NO: 19)
20.  RH93704 F          GTACTCCCATTCCTGCCAAA** (SEQ ID NO: 20)
21.  RH93704 B          TAAACATAGCACCAAGGGGC** (SEQ ID NO: 21)
22.  Met RH93704 F      ATACTCCCATTCCTACCAAA (SEQ ID NO: 22)
23.  Met RH93704 B      TAAATATAGTATTAAGGGGT (SEQ ID NO: 23)
24.  p15 Neg F          CCTCTGCTCCGCCTACTGG (SEQ ID NO: 24)
25.  p15 Neg B          CACCGTTGGCCGTAAACTTAAC (SEQ ID NO: 25)
26.  p16 Neg F          CAGAGGGTGGGGCGGACCGC (SEQ ID NO: 26)
27.  p16 Neg B          CCGCACCTCCTCTACCCGACCC (SEQ ID NO: 27)
28.  E-Cad Neg F        GCTAGAGGGTCACCGCGT (SEQ ID NO: 28)
29.  E-Cad Neg B        CTGAACTGACTTCCGCAAGCTC (SEQ ID NO: 29)
30.  GSTP-1 Neg F       GTGAAGCGGGTGTGCAAGCTC (SEQ ID NO: 30)
31.  GSTP1 Neg B        CGAAGACTGCGGCGGCGAAAC (SEQ ID NO: 31)
32.  T7GG               AGTAATACGACTCACTATAGG (SEQ ID NO: 32)
33.  T7GGN              AGTAATACGACTCACTATAGGN (SEQ ID NO: 33)
34.  T7SH               CCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 34)
35.  T7NSH              NCCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 35)
36.  T7-C₁₀             CCCCCCCCCCGTAATACGACTCACTATAGG (SEQ ID NO: 36)
37.  T7                 GTAATACGACTCACTATA (SEQ ID NO: 37)
38.  C₁₀                CCCCCCCCCC (SEQ ID NO: 38)
39.  p15 5′-Flank       TGCCACTCTCAATCTCGAACTA (SEQ ID NO: 39)
40.  p16 3′-Flank       GCGCTACCTGATTCCAATTCCCC (SEQ ID NO: 40)
41.  E-cad 5′-Flank     CATAGGTTTGGGTGAACTCTAA (SEQ ID NO: 41)
42.  E-cad 3′-Flank     GGCCTTTCTTCTAACAATCAG (SEQ ID NO: 42)
43.  Adapt Backbone     TGAGGTTGTTGAAGCGTTUACCCAAUTCGATUAGGCAA/3AmMC7/*** (SEQ ID NO: 43)
44.  Adapt Biot         Biot-TTGCCTAATCGAATTGGGTAAACG (SEQ ID NO: 44)
45.  Adapt Nick         CTTCAACAACCTCA/3AmMC7/*** (SEQ ID NO: 45)
46.  p15 Nick F         AGGTGCAGAGCTGTCGCTTTC (SEQ ID NO: 46)
47.  p15 Nick B         CACTGCCCTCAGCTCCTAATC (SEQ ID NO: 47)
48.  p16 Nick F         GGTAGGGGGACACTTTCTAGTC (SEQ ID NO: 48)
49.  p16 Nick B         AGGCGTGTTTGAGTGCGTTC (SEQ ID NO: 49)
50.  E-Cad Nick F       CCAAGGCAGGAGGATCGC (SEQ ID NO: 50)
51.  E-Cad Nick B       TCAGAAAGGGCTTTTACACTTG (SEQ ID NO: 51)
52.  E-Cad Add          GTGAGCTGTGATCGCACCA (SEQ ID NO: 52)
53.  E-Cad Add          GCGGTGACCCTCTAGCCT (SEQ ID NO: 53)
54.  GT short           CCAAACACACCC/3AmMC7/*** (SEQ ID NO: 54)
55.  T7SH-2N            NNCCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 55)
56.  T7SH-3N            NNNCCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 56)
57.  T7SH-4N            NNNNCCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 57)
58.  T7SH-5N            NNNNNCCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 58)
59.  T7SH-6N            CCTATAGTGAGT/3AmMC7/*** (SEQ ID NO: 59)
60.  GTSH-6N            CCAAACACAC/3AmMC7/*** (SEQ ID NO: 60)
61.  p16 SH-F           GGTAGGGGGACACTTTCTAGTC (SEQ ID NO: 61)
62.  p16 SH-B           AGGCGTGTTTGAGTGCGTTC (SEQ ID NO: 62)
63.  p15 SF             GCGCGCGATCCAGGTAGC (SEQ ID NO: 63)
64.  p15 SB             TAGGTTCCAGCCCCGATCCG (SEQ ID NO: 64)
65.  p16 LF             GGTGCCACATTCGCTAAGTGC (SEQ ID NO: 65)
66.  p16 LB             GCTGCAGACCCTCTACCCAC (SEQ ID NO: 66)
67.  E-Cad LB           CAGCAGCAGCGCCGAGAGG (SEQ ID NO: 67)
68.  GSTP1 Neg B2       CCTGGAGTCCCCGGAGTCG (SEQ ID NO: 68)

*Random bases definitions: N = A, C, G, or T; Y = C or T; R = A or G; M = A or C; K = G or T
**Primers to STS marker sequence RH93704 are from the UniSTS database at the National Center for Biotechnology Information's website.
***/3AmMC7/ = amino C7 modifier

B. Analysis of DNA Methylation Following Cleavage with McrBC Endonuclease

Methylation of cytosines in the 5′ position of the pyrimidine ring is the most important epigenetic alteration in eukaryotic organisms. In animals and humans, methylcytosine is predominantly found in cytosine-guanine (CpG) dinucleotides, whereas in plants it is more frequently located in cytosine-any base-guanine trinucleotides (CpNpG) (Fraga and Esteller, 2002). Two alternative groups of methods are currently used to study the degree of methylation in DNA samples: non-bisulfite and bisulfite conversion. The first relies on the use of methylation-sensitive restriction endonucleases combined with, for example, Southern blot or PCR detection. The second utilizes PCR amplification of bisulfite-converted DNA. Both methods suffer from significant drawbacks. Whereas the former is limited by the availability of suitable restriction sites and the specificity of methylation-sensitive enzymes, the latter is limited by the amount of DNA left after chemical conversion, incomplete denaturation, and/or incomplete desulfonation. In addition, bisulfite conversion is tedious, time-consuming, and requires a great deal of empirical optimization of specific primers and PCR conditions for the converted DNA.

In the present invention, there is a novel use of the unique properties of the exemplary E. coli endonuclease McrBC and its utility in the analysis of methylation in specific genomic regions.

In embodiments of the present invention there is a novel use of McrBC DNA endonuclease comprising digestion of genomic DNA to produce a plurality of ends originating from cleavage between DC^(m) (A/GC^(m)) recognition sites separated by between about 35 and about 3000 bases. In specific embodiments, digestion with McrBC is incomplete and results in predominant cleavage of a subset of sites separated by between about 35 and about 200 bases. In other specific embodiments, cleavage is complete and results in digestion of substantially all possible cleavage sites.
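The spacing rule described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method. It assumes that a half-site is a methylated cytosine immediately preceded by A or G on a single strand, and that a pair of half-sites is a candidate for cleavage when the two are separated by 35 to 3000 bases. The function names and the zero-based position convention are hypothetical.

```python
def mcrbc_half_sites(seq, methylated_positions):
    """Return sorted 0-based positions of methylated C preceded by A or G.

    Only one strand is considered (a simplifying assumption).
    """
    seq = seq.upper()
    sites = []
    for pos in sorted(set(methylated_positions)):
        if 0 < pos < len(seq) and seq[pos] == "C" and seq[pos - 1] in "AG":
            sites.append(pos)
    return sites


def cleavable_pairs(half_sites, min_gap=35, max_gap=3000):
    """Pair half-sites whose spacing lies within [min_gap, max_gap] bases."""
    pairs = []
    for i, first in enumerate(half_sites):
        for second in half_sites[i + 1:]:
            gap = second - first
            if gap > max_gap:
                break  # half_sites is sorted, so later gaps only grow
            if gap >= min_gap:
                pairs.append((first, second))
    return pairs


# Toy sequence: methylated C at index 1 (after A), 43 (after G), 55 (after T)
toy = "AC" + "T" * 40 + "GC" + "T" * 10 + "TC"
sites = mcrbc_half_sites(toy, {1, 43, 55})
pairs = cleavable_pairs(sites)
```

Narrowing `max_gap` to 200 would model the subset of closely spaced sites that the text associates with incomplete digestion.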

In a specific embodiment of the invention, PCR amplification with primers flanking a region analyzed for methylation is performed following McrBC cleavage. The presence of methylation sites recognized by the McrBC endonuclease results in at least one cleavage event between the priming sites and thus results in a lack of amplification products. The sensitivity of detection decreases in this McrBC-mediated direct promoter methylation assay if a mixture of methylated and non-methylated DNA is analyzed, as is often the case with clinical samples containing a few malignant cells amidst a large number of non-malignant cells. Thus, a DNA methylation assay is needed for the analysis of samples containing a mixture of different cells.
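The loss of sensitivity in mixed samples can be made concrete with a short, illustrative calculation in Python. The model is an assumption rather than a description of the assay: every methylated template is taken to be cleaved with a given efficiency, unmethylated templates are taken to remain intact, and amplification is taken to double the product each cycle, so the threshold cycle shifts by the negative base-2 logarithm of the intact fraction. The function names are hypothetical.

```python
import math


def residual_signal(methylated_fraction, cleavage_efficiency=1.0):
    """Fraction of template copies left intact after McrBC digestion."""
    for value in (methylated_fraction, cleavage_efficiency):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return 1.0 - methylated_fraction * cleavage_efficiency


def ct_shift(methylated_fraction, cleavage_efficiency=1.0):
    """Delay in threshold cycle, assuming perfect doubling per cycle."""
    remaining = residual_signal(methylated_fraction, cleavage_efficiency)
    if remaining == 0.0:
        return math.inf  # no intact template: no product at all
    return -math.log2(remaining)


half_methylated = ct_shift(0.5)   # one cycle of delay
one_percent = ct_shift(0.01)      # far less than one cycle
```

Under these assumptions a sample in which 1% of the DNA is methylated still retains 99% of the signal, which is why a negative-signal assay struggles with such mixtures.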

In embodiments of the present invention, the present inventors take advantage of the frequency of McrBC recognition sites and the kinetic differences between hypermethylated sites and sites with low levels of methylation or a lack of methylation.

In a specific embodiment, DNA termini produced by cleavage with McrBC are modified by ligation of universal adaptor sequences followed by incorporation of a short homopolymeric sequence that allows multiplexed asymmetric one-sided PCR amplification between the universal terminal sequence and sites internal to, or flanking, the hypermethylated region.

In another specific embodiment of the invention, DNA termini produced by cleavage with McrBC are modified by ligation of biotinylated nick-attaching adaptor sequences. The nicks are propagated to a controlled distance from the adaptor, and the uniformly sized nick-translation products are immobilized on a solid phase and analyzed for the presence of sequences internal to, or flanking, a methylation site. The McrBC libraries of this type can be used for discovery of unknown hypermethylated promoters or imprinted genes by sequencing or by hybridization to microarrays.

In another specific embodiment, 3′ recessed ends of McrBC cleavage sites are extended in the presence of a biotin-comprising nucleotide analog, followed by DNA fragmentation, immobilization on solid support, and/or analysis for sequences internal to, or flanking, a methylation site. McrBC libraries of this type can also be used for discovery of unknown hypermethylated sites by sequencing or by hybridization to microarrays.

In a preferred embodiment of the invention, libraries comprising short amplifiable DNA sequences generated by McrBC cleavage from promoter sites are utilized. These short sequences will be present only if a particular promoter is methylated, and thus comparative hybridization and/or amplification can be used for genome-wide analysis and quantification of the methylation pattern at promoter CpG sites. First, genomic DNA from test and control samples is cleaved with McrBC endonuclease. Universal adaptor sequences are then ligated to the overhangs produced by the enzyme, and short fragments are amplified either prior to, or following, size separation of the DNA. The method of size separation could be any of a number of physical DNA fractionation methods well known in the art, such as gel electrophoresis, size exclusion chromatography, or membrane micro-filtration, for example. In a specific embodiment of the invention, the size fractionation is achieved by a membrane micro-filtration process. In another specific embodiment, separation is carried out by size-selective DNA amplification. Analysis and quantification of promoter-specific short fragments in the amplified libraries are conducted by comparative hybridization and/or amplification. The magnitude of the signal will be proportional to the level of methylation of the promoter site being investigated. An added advantage to the quantitative aspect of the method described in this embodiment is the potential of physical mapping of methylation patterns by hybridization to, for example, a microarray comprising a tiled path of short promoter sequences.

1. Sources of DNA

Genomic DNA of any source or complexity, or fragments thereof, can be analyzed by the methods described in the invention. Clinical samples representing biopsy materials, pap smears, DNA from blood cells, serum, plasma, or other body fluids, or DNA isolated from cultured primary or immortalized tissue cultures, for example, can be used as a source for methylation analysis.

2. McrBC Cleavage

In embodiments of the present invention DNA is digested with McrBC endonuclease in the presence of GTP as the energy source for subunit translocation. A typical digestion with McrBC endonuclease is performed in a volume ranging from about 5 μl to about 50 μl in buffer containing about 50 mM NaCl, about 10 mM Tris-HCl having pH of about 7.5 to about 8.5, about 100 μg/ml of bovine serum albumin, about 0.5 to about 2 mM GTP, and about 0.2 to about 20 units of McrBC endonuclease. The temperature of incubation is between about 16° C. and about 42° C. and the duration is between about 10 minutes and about 16 hours. The quantity of DNA in the reaction is between 50 pg and 10 μg. It should be noted that McrBC makes one cut between each pair of half-sites, cutting close to one half-site or the other, but cleavage positions are distributed over several base pairs approximately 30 base pairs from the methylated base (Panne et al., 1999), resulting in a smeared pattern instead of defined bands. In specific embodiments, digestion with McrBC is incomplete and results in predominant cleavage of a subset of sites separated by between about 35 and about 250 bases. In other specific embodiments, cleavage is complete and results in digestion of substantially all possible cleavage sites. Example 3 describes the optimization of the cleavage of human genomic DNA and analysis of the termini produced by McrBC. It should be noted that the nature of the ends produced by McrBC digestion is not understood from the existing literature. Example 9 also details the analysis of the nature of the ends produced by McrBC cleavage.

3. Direct Analysis of DNA Methylation by PCR Following McrBC Cleavage

In a preferred embodiment, following McrBC cleavage of genomic DNA, aliquots of digested DNA or control non-digested DNA are amplified by PCR using primers specific to known methylation sites within promoter CpG islands involved in epigenetic control of carcinogenesis. A typical reaction mixture comprises 1× Titanium Taq reaction buffer (Clontech), about 200 μM of each dNTP, about 4% DMSO, about 200 nM of primers specific for CpG regions of a methylation site of interest, and about 2 units of Titanium Taq polymerase (Clontech) in a reaction volume of between about 20 and about 50 μl. Cycling conditions vary depending on the melting temperatures of the primers and the length of the amplified product. Control samples of non-digested DNA are included in parallel with the analyzed samples, along with positive controls of genomic DNA that is fully methylated with SssI CpG methylase. Aliquots of the PCR reactions are analyzed on a 1% agarose gel after staining with ethidium bromide. If at least one cleavage event occurs between priming sites that flank McrBC recognition half-sites, no PCR product will be amplified. The assay thus reduces the signal, or produces a negative signal, that correlates with methylation of cytosines (see Example 5 and FIG. 12).
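As a worked illustration of the final concentrations recited above, the following Python sketch converts them into pipetting volumes using the dilution relation C1·V1 = C2·V2. The stock concentrations (10 mM of each dNTP, neat DMSO, and 10 μM of each primer) are assumptions chosen for the example and are not taken from the specification. The function and dictionary names are hypothetical.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock to add so that C1*V1 = C2*V2 (same units for C1, C2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least the final value")
    return final_conc / stock_conc * reaction_volume_ul


# (final concentration, ASSUMED stock concentration), matching units per entry
ASSUMED_COMPONENTS = {
    "dNTP, each (uM)": (200.0, 10_000.0),
    "DMSO (% v/v)": (4.0, 100.0),
    "primer, each (uM)": (0.2, 10.0),
}


def pcr_mix(reaction_volume_ul=25.0):
    """Microliters of each assumed stock for one reaction, rounded to 1 nl."""
    return {
        name: round(stock_volume_ul(final, stock, reaction_volume_ul), 3)
        for name, (final, stock) in ASSUMED_COMPONENTS.items()
    }


mix_25 = pcr_mix(25.0)
mix_50 = pcr_mix(50.0)
```

Buffer, polymerase, template, and water are left out of the sketch because their volumes depend on supplier formulations and on the DNA concentration of each sample.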

4. Analysis of DNA Methylation by One-Sided PCR from McrBC Cleavage Sites

Example 6 provides a description of another aspect of the present invention regarding development of McrBC-mediated library assays for DNA methylation based on ligation of a universal adaptor to McrBC cleavage sites, followed by incorporation of a poly-C tail allowing one-sided PCR between the homopolymeric sequence and a specific site flanking the methylated region. The McrBC libraries of this type can be used for cancer diagnostics, gene imprinting, and developmental research studies, as well as for discovery of unknown hypermethylated genomic regions, for example.

In a typical reaction about 100 ng or less of McrBC digested DNA is treated with Klenow fragment of DNA polymerase I to produce blunt ends in about 10 to about 100 μl of 1× T4 Ligase buffer (NEB) containing about 2 to about 20 nM of each dNTP, at about 25° C., for about 15 minutes to overnight. The ligation reaction comprises 1× T4 Ligase buffer (NEB), 100 ng or less of blunt-end template DNA, 3.75 μM final concentration of universal T7 adaptors (see Example 6), and 2,000-2,000,000 units of T4 DNA Ligase at about 16° C. to about 25° C. for about 1 hour to overnight. Homo-polymeric extensions are next incorporated at the ends of the ligated fragments using a T7-C₁₀ primer (SEQ ID NO: 36) comprising ten 5′ cytosine bases and a 3′ T7 promoter sequence. A critical feature of this sequence is that it allows asymmetric one-sided PCR amplification due to the strong suppression effect of the terminal poly-G/poly-C duplex making the amplification between the terminal inverted repeats substantially inefficient (U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). The amplification reaction comprises about 1 to about 5 ng of McrBC library DNA with ligated universal T7 adaptors, about 1× Taq polymerase, about 200 μM of each dNTP, and about 1 μM universal T7-C₁₀ primer (SEQ ID NO: 36). In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out at 72° C. for 15 minutes to “fill-in” the 3′-recessed ends of the T7 adaptor sequence, followed by a 2-step cycling protocol of 94° C. for 15 seconds, 65° C. for 2 minutes for the optimal number of cycles. Optimal cycle number is determined by analysis of DNA production using either real-time PCR or optical density. Typically, about 3 to 5 μg of amplified DNA can be obtained from a 25 μl reaction using optimized conditions.

To analyze the methylation status of promoter CpG islands, one-sided PCR is performed using about 20 to about 50 ng of purified McrBC library DNA prepared as described above from control and test cells, a universal C₁₀ primer comprising ten C bases (SEQ ID NO: 38), and primers specific for regions flanking the CpG islands of different promoters implicated in epigenetic control of carcinogenesis. The amplification reaction comprises about 20 to about 50 ng of McrBC library DNA, about 1× Taq polymerase, about 200 μM of each dNTP, about 4% DMSO, and about 1 μM universal C₁₀ primer (SEQ ID NO: 38). In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out under optimal conditions for annealing temperature, extension time, and cycle number depending on the melting temperature and length of the amplified product.

Since the amplification involves the boundaries of hypermethylated genomic regions, a skilled artisan will recognize that flanking regions of different promoters will have different levels of methylation. This fact should be taken into consideration when designing primers for one-sided PCR. For example, the transcribed regions adjacent to the 3′ end of most CpG islands in normal cells are known to be heavily methylated, whereas for promoters involved in epigenetic control of carcinogenesis in cancer cells, these regions are largely hypomethylated (Baylin and Herman, 2000). Generally, primers located at a distance of between about 300 and about 700 bases from the boundary of a CpG island are well suited for analysis of methylation. Example 6 and FIG. 14 demonstrate the sensitivity limits of the McrBC-mediated library promoter methylation assay described herein. As little as 0.1% of cancer DNA can be detected in a background of 99.9% of normal DNA (see FIG. 14).

5. DNA Libraries Prepared by Nick-Translation from McrBC Cleavage Sites and their Utility for DNA Methylation Analysis

In a preferred embodiment of the present invention, an McrBC-mediated library promoter methylation diagnostic assay is described utilizing ligation of a nick-attaching biotinylated adaptor to McrBC cleavage sites, propagation of the nick to a controlled distance from the adaptor, immobilization of the uniformly sized nick-translation products on a solid support, and analysis of sequences internal to, or flanking, a methylation site, for example a CpG island (see Example 7). The McrBC libraries of this type can be used for cancer diagnostics, gene imprinting, and developmental research studies, as well as for discovery of unknown hypermethylated genomic regions.

In a typical library synthesis reaction, about 100 to about 1000 ng of McrBC digested DNA is treated with Klenow fragment of DNA polymerase I to produce blunt ends in about 10 to about 100 μl of 1× T4 Ligase buffer (NEB), containing about 2 to about 20 nM of each dNTP, at about 25° C. for about 15 minutes to overnight. The ligation reaction comprises 1× T4 Ligase buffer (NEB), 100 ng or less of blunt-end template DNA, 3.75 μM final concentration of biotinylated nick-attaching adaptor (see Example 7), and about 2,000 to about 2,000,000 units of T4 DNA Ligase at about 16° C. to about 25° C. for about 1 hour to overnight. Samples are purified and further subjected to nick-translation in about 100 μl of 1× ThermoPol buffer (NEB) containing about 200 μM of each dNTP, and about 5 units of wild type Taq polymerase at about 45° C. to about 65° C. for about 1 to about 5 minutes. The nick-translation products are denatured and bound to streptavidin magnetic beads. After washing of the unbound material, aliquots of the beads are either directly analyzed for the presence of sequences internal to, or flanking, hypermethylated sites, or the DNA is further amplified using self-inert degenerate primers and the Klenow fragment of DNA polymerase I (see U.S. Provisional Patent Application 60/453,060, filed Mar. 7, 2003, and the U.S. Nonprovisional application claiming priority to same, filed concomitantly herewith), and then analyzed similarly. For direct methylation analysis, aliquots of the streptavidin bead suspension are amplified in reactions comprising about 200 μM of each dNTP, about 4% DMSO, about 200 nM each forward and reverse primer, and about 5 units of Taq polymerase. In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out under optimal conditions for annealing temperature, extension time, and cycle number, depending on the annealing temperature and length of the amplified product.

In order to produce sufficient amounts of the McrBC library DNA for analysis of multiple methylation sites or for microarray analysis of unknown hypermethylation sites, aliquots of the DNA bound to the magnetic beads may be amplified in a reaction comprising about 50 to about 500 μg magnetic beads, about 0.05 to about 1 μM universal K_(U) primer (SEQ ID NO: 15), about 4% DMSO, about 200 μM 7-deaza-dGTP (Sigma), and about 5 units of Taq polymerase. PCR is carried out using a cycling protocol of 94° C. for 15 seconds, 65° C. for 2 minutes for the optimal number of cycles. Aliquots of the amplified DNA are then analyzed for the presence of sequences internal to, or flanking, hypermethylated sites, or hybridized to microarrays for discovery of unknown methylation sites.

6. Preparation of DNA Libraries by Direct Biotin Incorporation at McrBC Cleavage Sites for DNA Methylation Analysis

Example 8 describes another aspect of the present invention in which an McrBC-mediated library promoter methylation diagnostic assay is developed by extension of the 3′ recessed ends of McrBC cleavage sites in the presence of a biotin-containing nucleotide analog, followed by DNA fragmentation, immobilization on a solid support, and analysis of sequences internal to, or flanking, a methylation site, such as a promoter CpG island. The McrBC libraries of this type can be used for cancer diagnostics, gene imprinting, and developmental research studies, as well as for discovery of unknown hypermethylated genomic regions.

In a typical library preparation, about 100 to about 1000 ng of McrBC digested DNA are labeled in a reaction comprising about 20 nM of each dNTP, about 20 to about 50 nM of biotin-containing nucleotide analog either completely substituting or in about equal ratio with the corresponding unlabeled nucleotide, about 5 to about 20 units of the Klenow Exo- fragment of DNA polymerase I or about 5 to about 10 units of wild type Taq polymerase at about 25° C. (in the case of Klenow) or at about 55° C. (in the case of Taq polymerase) for about 20 to about 120 minutes. After removal of the free biotin analog the labeled DNA is fragmented by heating at 95° C. in TE buffer for about 2 to about 8 minutes (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791), snap-cooled on ice for about 5 minutes, and bound to streptavidin magnetic beads. After washing of the unbound material, aliquots of the beads are either directly analyzed for the presence of sequences internal to, or flanking, hypermethylated sites, or the DNA is further amplified using self-inert degenerate primers and the Klenow fragment of DNA polymerase I (see, for example, U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403), and then analyzed similarly. For direct methylation analysis, aliquots of the streptavidin bead suspension are amplified in a reaction comprising about 200 μM of each dNTP, about 4% DMSO, about 200 nM each forward and reverse primer, and about 5 units of Taq polymerase. In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out under optimal conditions for annealing temperature, extension time, and cycle number, depending on the annealing temperature and length of the amplified product.

In order to produce sufficient amounts of the McrBC library DNA for analysis of multiple methylation sites, or for microarray analysis of unknown hypermethylation sites, aliquots of the DNA bound to magnetic beads are amplified in a reaction comprising about 50 to about 500 μg magnetic beads, about 0.05 to about 1 μM universal K_(U) primer (SEQ ID NO: 15), about 4% DMSO, about 200 μM 7-deaza-dGTP (Sigma), and about 5 units of Taq polymerase. PCR is carried out using a cycling protocol of 94° C. for 15 seconds, 65° C. for 2 minutes for the optimal number of cycles. Aliquots of the amplified DNA are then analyzed for the presence of sequences internal to, or flanking, hypermethylated sites, or hybridized to microarrays for the discovery of unknown methylation sites.

7. Preparation of Libraries from Short DNA Fragments Produced by McrBC Cleavage for Analysis of Promoter Hypermethylation

Examples 10, 11 and 12 describe the preparation of libraries comprising short amplifiable DNA sequences generated by McrBC cleavage of promoter sites. First, genomic DNA from test and control samples is cleaved with McrBC. Universal adaptor sequences are then ligated to the overhangs produced by the nuclease, and short fragments are amplified either prior to, or following, size separation of DNA.

Size separation can be achieved by any of a number of physical size fractionation methods well known in the art, such as gel electrophoresis, size exclusion chromatography, or membrane micro-filtration, for example. Alternatively, separation is achieved by size-selective DNA amplification.

Analysis and quantification of promoter-specific short fragments is accomplished by comparative hybridization and/or amplification. The magnitude of the signal is proportional to the level of methylation of the promoter site.

In a typical McrBC cleavage reaction, aliquots of about 1 to about 50 ng of test and control genomic DNA are digested with about 0.1 to about 10 units of McrBC endonuclease. After inactivation of the McrBC enzyme the products of digestion are incubated in a ligation reaction comprising T4 ligase buffer, about 200 nM to about 1 μM of universal adaptors with 5′ overhangs comprising about 5 or 6 completely random bases, and about 200 to 2,500 units of T4 DNA ligase for about 1 hour to overnight at about 16° C. to about 25° C. The T4 DNA ligase is inactivated for 10 minutes at 65° C. and the resulting DNA molecules are size-fractionated either by applying any of a number of physical size fractionation methods well known in the art, such as gel electrophoresis, size exclusion chromatography, or membrane micro-filtration, or by size-selective DNA amplification. In preferred embodiments, the method of size fractionation is micro-filtration through a membrane filter. The ligation reactions are supplemented with about 50 mM to about 250 mM NaCl, and DNA is passed through Microcon YM-100 filters (Millipore) at 500× g at ambient temperature. Under these ionic strength conditions the Microcon filters retain DNA fragments above approximately 250 bp.
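The size-fractionation step can be pictured with a small, purely illustrative Python sketch. It assumes an idealized sharp cutoff at 250 bp, whereas a real membrane filter separates fragments gradually around its nominal limit. The cut positions, function names, and zero-based coordinates are hypothetical.

```python
def fragment_lengths(sequence_length, cut_positions):
    """Lengths of the fragments produced by cutting a linear molecule."""
    cuts = sorted({c for c in cut_positions if 0 < c < sequence_length})
    bounds = [0] + cuts + [sequence_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def split_by_cutoff(lengths, cutoff_bp=250):
    """Partition fragments into filtrate (short) and retained (long)."""
    filtrate = [n for n in lengths if n < cutoff_bp]
    retained = [n for n in lengths if n >= cutoff_bp]
    return filtrate, retained


# Hypothetical 1000 bp region cut at three positions
lengths = fragment_lengths(1000, [100, 180, 600])
filtrate, retained = split_by_cutoff(lengths)
```

In this toy case the closely spaced cuts at positions 100 and 180 yield the short fragments that pass into the filtrate, mirroring how densely methylated regions are expected to contribute the short library fragments.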

The small fragments in the filtrate fractions are then concentrated by ethanol precipitation and used in PCR amplification reactions (see below). In other preferred embodiments size separation is achieved by selective amplification using two different universal adaptor sequences and reduced extension times (Example 12). The 3′ ends of the universal adaptor are first filled in by extension and the libraries are amplified by PCR in a reaction comprising about 0.25 to about 1 μM universal adaptor primer(s), about 200 μM of each dNTP, about 4% DMSO, and about 5 units of Taq DNA polymerase. PCR is carried out using a cycling protocol of 94° C. for 15 seconds, 65° C. for 15 seconds (in the case of size-selective amplification) or 2 minutes (in the case of libraries that are size fractionated by microfiltration) for the optimal number of cycles. In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. Aliquots of the amplified DNA are then analyzed for the presence of sequences internal to, or flanking, promoter CpG islands. This can be achieved by comparative hybridization and/or amplification. The magnitude of the signal is proportional to the level of methylation of a promoter site.

C. Amplification and Identification of Methylated Restriction Sites Using Methylation-Sensitive Restriction Enzyme Digestion of DNA, Whole Genome Amplification, Restriction Digestion with the Same Enzyme, and Site-Specific Genome Amplification

In this embodiment, there are methods of preparing a library of DNA molecules in such a way as to select for molecules adjacent to methylated CpG's that are contained in a methylation-sensitive restriction enzyme recognition site. A list of exemplary methylation-sensitive restriction enzymes is presented in Table III. The choice of restriction enzyme defines the sites that will be targeted during library preparation and amplification. The presence of a specific site in the final amplified product will indicate that the adjacent CpG contained in the methylation-sensitive restriction site was methylated. Furthermore, use of control DNA that is not digested by the restriction enzyme during the initial library preparation will allow validation of the selection of each site during library preparation and amplification.

1. Digestion of Genomic DNA with a Methylation-Sensitive Restriction Endonuclease

In a specific embodiment, genomic DNA is digested with a methylation-sensitive restriction endonuclease, such as Not I. The digestion reaction comprises about 50 ng to 5 μg of genomic DNA, 1× reaction buffer, and 1 to about 25 U of Not I restriction endonuclease. The mixture is incubated at 37° C. for 12 to 16 hours to ensure complete digestion. The enzyme is inactivated at 65° C. for 15 minutes and the sample is precipitated and resuspended to a final concentration of 1 to 50 ng/μl. Genomic DNA that has not been digested is used as a positive control during library preparation and analysis, for example.
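The selection principle behind this digestion can be sketched in Python as follows. The sketch relies on the Not I recognition sequence GCGGCCGC, which contains two CpG dinucleotides, and assumes for simplicity that methylation of either CpG cytosine on the scanned strand fully protects the site from cleavage. The function names and zero-based coordinates are hypothetical, and partial digestion or hemimethylation is not modeled.

```python
import re

NOTI_SITE = "GCGGCCGC"
# Offsets of the cytosines that sit in CpG context within the site
CPG_C_OFFSETS = (1, 5)


def noti_sites(seq):
    """0-based start positions of every Not I site, overlaps included."""
    return [m.start() for m in re.finditer(f"(?={NOTI_SITE})", seq.upper())]


def classify_sites(seq, methylated_c_positions):
    """Split sites into cleaved (unmethylated) and protected (methylated)."""
    methylated = set(methylated_c_positions)
    cleaved, protected = [], []
    for start in noti_sites(seq):
        if any(start + offset in methylated for offset in CPG_C_OFFSETS):
            protected.append(start)
        else:
            cleaved.append(start)
    return cleaved, protected


# Two sites; the cytosine at index 16 (second site, first CpG) is methylated
toy = "AAAGCGGCCGCTTTTGCGGCCGCAA"
cleaved, protected = classify_sites(toy, {16})
```

Sites reported as protected correspond to those that survive the first digestion intact and therefore remain available for the later steps that indicate methylation.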

2. Preparation of Randomly Fragmented DNA

Generally, a library is prepared in at least 4 steps: first, randomly fragmenting the DNA into pieces, such as with an average size between about 500 bp and about 4 kb; second, repairing the 3′ ends of the fragmented pieces and generating blunt, double stranded ends; third, attaching universal adaptor sequences to the 5′ ends of the fragmented pieces; and fourth, filling in of the resulting 5′ adaptor extensions. In an alternative embodiment, the first step comprises obtaining DNA molecules defined as fragments of larger molecules, such as may be obtained from a tissue (for example, blood, urine, feces, and so forth), a fixed sample, and the like, and may comprise substantially fragmented DNA. Such DNA may comprise lesions including double or single stranded breaks.

A skilled artisan recognizes that random fragmentation can be achieved by at least three exemplary means: mechanical fragmentation, chemical fragmentation, and/or enzymatic fragmentation.

3. Repairing of the 3′ Ends of the Fragmented Pieces and Preparation of Blunt Double Stranded Ends

a. Repair of Mechanically Fragmented DNA

Mechanical fragmentation can occur by any method known in the art, including hydrodynamic shearing of DNA by passing it through a narrow capillary or orifice (Oefner et al., 1996; Thorstenson et al., 1998), sonicating the DNA, such as by ultrasound (Bankier, 1993), and/or nebulizing the DNA (Bodenteich et al., 1994). Mechanical fragmentation usually results in double strand breaks within the DNA molecule.

DNA that has been mechanically fragmented has been demonstrated to have blocked 3′ ends that are incapable of being extended by Taq polymerase without a repair step. Furthermore, mechanical fragmentation utilizing a hydrodynamic shearing device (such as HydroShear; GeneMachines, Palo Alto, Calif.) results in at least three types of ends: 3′ overhangs, 5′ overhangs, and blunt ends. In order to effectively ligate the adaptors to these molecules and extend these molecules across the region of the known adaptor sequence, the 3′ ends need to be repaired so that preferably the majority of ends are blunt. This procedure is carried out by incubating the DNA fragments with a DNA polymerase having both 3′ exonuclease activity and 3′ polymerase activity, such as Klenow or T4 DNA polymerase (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791), or with a mixture of enzymes that separately comprise the 3′ exonuclease activity and the 3′ polymerase activity. Although reaction parameters may be varied by one of skill in the art, in an exemplary embodiment incubation of the DNA fragments with Klenow in the presence of 40 nmol dNTP and 1× T4 DNA ligase buffer results in optimal production of blunt end molecules with competent 3′ ends.

Alternatively, Exonuclease III and T4 DNA polymerase can be utilized to remove 3′ blocked bases from recessed ends and extend them to form blunt ends (U.S. Pat. No. 6,197,557). In a specific embodiment, an additional incubation with T4 DNA polymerase or Klenow maximizes production of blunt ended fragments with 3′ ends that are competent to undergo ligation to the adaptor.

In specific embodiments, the ends of the double stranded DNA molecules still comprise overhangs following such processing, and particular adaptors are utilized in subsequent steps that correspond to these overhangs.

b. Repair of Chemically Fragmented DNA

Chemical fragmentation of DNA can be achieved by any method known in the art, including acid or alkaline catalytic hydrolysis of DNA (Richards and Boyer, 1965), hydrolysis by metal ions and complexes (Komiyama and Sumaoka, 1998; Franklin, 2001; Branum et al., 2001), hydroxyl radicals (Tullius, 1991; Price and Tullius, 1992) and/or radiation treatment of DNA (Roots et al., 1989; Hayes et al., 1990). Chemical treatment could result in double or single strand breaks, or both.

In a specific embodiment, chemical fragmentation occurs by heat (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791). In a further specific embodiment, a temperature greater than room temperature, in some embodiments at least about 40° C., is provided. In alternative embodiments, the temperature is ambient temperature. In further specific embodiments, the temperature is between about 40° C. and 120° C., between about 80° C. and 100° C., between about 90° C. and 100° C., between about 92° C. and 98° C., between about 93° C. and 97° C., or between about 94° C. and 96° C. In some embodiments, the temperature is about 95° C.

In a specific embodiment, DNA that has been chemically fragmented exists as single stranded DNA and has been demonstrated to have blocked 3′ ends. In order to generate double stranded 3′ ends that are competent to undergo ligation, a fill-in reaction with random primers and DNA polymerase that has 3′-5′ exonuclease activity, such as Klenow, T4 DNA polymerase, or DNA polymerase I, is performed (see, for example, U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). This procedure results in several types of molecules depending on the polymerase used and the conditions of the reaction. In the presence of a non strand-displacing polymerase, such as T4 DNA polymerase, fill-in with phosphorylated random primers will result in multiple short sequences that are extended until they are stopped by the presence of a downstream random-primed fragment. This will result in two ends that are competent to undergo ligation (FIG. 39). A strand-displacing enzyme such as Klenow will result in displacement of downstream fragments that can subsequently be primed and extended. This will result in production of a branched structure that has multiple ends competent to undergo ligation in the next step (FIG. 40). Finally, use of an enzyme with nick translation ability, such as DNA polymerase I, will result in nick translation of all fragments leading to a single secondary strand capable of ligation (FIG. 41). A skilled artisan recognizes that nick translation comprises a coupled polymerization/degradation process that is characterized by coordinated 5′-3′ DNA polymerase activity and 5′-3′ exonuclease activity. The two enzymatic activities are usually present within one enzyme molecule (as in the case of Taq DNA polymerase or DNA polymerase I); however, nick translation may also be achieved by simultaneous activity of multiple enzymes exhibiting separate polymerase and exonuclease activities. Incubation of the DNA fragments with Klenow in the presence of 0.1 to 10 pmol of phosphorylated primers in a two temperature protocol (37° C., 12° C.) results in optimal production of blunt end fragments with 3′ ends that are competent to undergo ligation to the adaptor.

c. Repair of Enzymatically Fragmented DNA

Enzymatic fragmentation of DNA may be utilized by standard methods in the art, such as by partial restriction digestion by Cvi JI endonuclease (Gingrich et al., 1996), or by DNAse I (Anderson, 1981; Ausubel et al., 1987), for example. Fragmentation by DNAse I may occur in the presence of about 1 to 10 mM Mg²⁺ ions (predominantly single strand breaks) or in the presence of about 1 to 10 mM Mn²⁺ ions (predominantly double strand breaks).

DNA that has been enzymatically fragmented in the presence of Mn²⁺ has been demonstrated to have either blunt ends or 1 to 2 bp overhangs. Thus, it is possible to omit the repair step and proceed directly to ligation of adaptors. Alternatively, the 3′ ends can be repaired so that a higher plurality of ends are blunt, resulting in improved ligation efficiency. This procedure is carried out by incubating the DNA fragments with a DNA polymerase containing both 3′ exonuclease activity and 3′ polymerase activity, such as Klenow or T4 DNA polymerase. For example, incubation of the DNA fragments with Klenow in the presence of 40 nmol dNTP and 1× T4 DNA ligase buffer results in optimal production of blunt end molecules with competent 3′ ends, although modifications of the reaction parameters by one of skill in the art are well within the scope of the invention.

Alternatively, Exonuclease III and T4 DNA polymerase can be utilized to remove 3′ blocked bases from recessed ends and extend them to form blunt ends (see U.S. Pat. No. 6,197,557, incorporated by reference herein in its entirety). An additional incubation with T4 DNA polymerase or Klenow maximizes production of blunt ended fragments with 3′ ends that are competent to undergo ligation to the adaptor.

DNA that has been enzymatically digested with DNAse I in the presence of Mg²⁺ has been demonstrated to have single stranded nicks. Denaturation of this DNA would result in single stranded DNA fragments of random size and distribution. In order to generate double stranded 3′ ends, a fill-in reaction with random primers and DNA polymerase that has 3′-5′ exonuclease activity, such as Klenow, T4 DNA polymerase, or DNA polymerase I, is performed. Use of these enzymes will result in the same types of products as described in item b. above (Repair of Chemically Fragmented DNA).

4. Sequence Attachment to the Ends of DNA Fragments

The following ligation procedure is designed to work with both mechanically and chemically fragmented DNA that has been successfully repaired and comprises blunt double stranded 3′ ends. Under optimal conditions, the repair procedures will result in the majority of products having blunt ends. However, due to the competing 3′ exonuclease activity and 3′ polymerization activity, there will also be a portion of ends that have a 1 bp 5′ overhang or a 1 bp 3′ overhang, for example. Therefore, there are three types of adaptors that can be ligated to the resulting DNA fragments to maximize ligation efficiency, and preferably the adaptors are ligated to one strand at both ends of the DNA fragments. These three adaptors are illustrated in FIG. 32 and include the following: blunt end adaptor, 5′ N overhang adaptor, and 3′ N overhang adaptor. The combination of these 3 adaptors has been demonstrated to increase the ligation efficiency compared to any single adaptor. These adaptors are comprised of two oligos, 1 short and 1 long, that are hybridized to each other at some region along their length. In a specific embodiment, the long oligo is a 20-mer that will be ligated to the 5′ end of fragmented DNA. In another specific embodiment, the short oligo strand is a 3′ blocked 11-mer complementary to the 3′ end of the long oligo. A skilled artisan recognizes that the length of the oligos that comprise the adaptor may be modified, in alternative embodiments. For example, a range of oligo length for the long oligo is about 18 bp to about 100 bp, and a range of oligo length for the short oligo is about 7 bp to about 20 bp. Furthermore, the structure of the adaptors has been developed to minimize ligation of adaptors to each other via at least one of three means: 1) lack of a 5′ phosphate group necessary for ligation; 2) presence of about a 7 bp 5′ overhang that prevents ligation in the opposite orientation; and/or 3) a 3′ blocked base preventing fill-in of the 5′ overhang. The ligation of a specific adaptor is detailed in FIG. 42.

A typical ligation procedure involves the incubation of 1 to 100 ng of DNA in 1× T4 DNA ligase buffer, 10 pmol of each adaptor, and 400 Units of T4 DNA Ligase. Ligations are performed at 16° C. for 1 hour, followed by inactivation of the ligase at 75° C. for 15 minutes. The products of ligation can be stored at −20° C. to 4° C. until amplification.

5. Extension of the 3′ End of the DNA Fragment to Fill in the Universal Adaptor

Due to the absence of a phosphate group at the 5′ end of the adaptor, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. A 72° C. extension step is performed on the DNA fragments in the presence of 1× DNA polymerase, 1× PCR Buffer, 200 μM of each dNTP, and 1 μM universal primer. This step may be performed immediately prior to amplification using Taq polymerase, or may be carried out using a thermo-labile polymerase, such as if the libraries are to be stored for future use. The ligation and extension steps are detailed in FIG. 42.

6. One-Step Adaptor Attachment Method

In a specific embodiment, the amplification reaction comprises about 1 to about 5 ng of template DNA, universal primer T7-C₁₀ (SEQ ID NO: 36), Taq polymerase, 1× polymerase buffer, and 200 μM of each dNTP. In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out using a 2-step protocol of 94° C. 15 seconds, 65° C. 2 minutes for the optimal number of cycles. Optimal cycle number is determined by analysis of DNA production using either real-time PCR or spectrophotometric analysis. Typically, about 5 to about 15 μg of amplified DNA can be obtained from a 25 to 75 μl reaction using optimized conditions. The presence of the short oligo from the adaptor does not interfere with the amplification reaction due to its low melting temperature and the blocked 3′ end that prevents extension.

7. Amplification of DNA Fragments Using the Universal Primer

In a specific embodiment, the amplification primer incorporates a poly-cytosine extension that functions to suppress secondary library amplification by C₁₀ oligonucleotide alone (SEQ ID NO: 38). To incorporate the extension and provide optimal priming, the library is amplified using a reaction mixture comprising the universal primer T7-C₁₀ (SEQ ID NO: 36), Taq polymerase, 1× polymerase buffer, and 200 μM of each dNTP. In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out using a 2-step protocol of 94° C. 15 seconds, 65° C. 2 minutes for the optimal number of cycles. Optimal cycle number is determined by analysis of DNA production using either real-time PCR or spectrophotometric analysis. Typically, about 5 to about 15 μg of amplified DNA can be obtained from a 25 to 75 μl reaction using optimized conditions. The presence of the short oligo from the adaptor does not interfere with the amplification reaction due to its low melting temperature and the blocked 3′ end that prevents extension.

8. Digestion of Amplified Fragments

In a specific embodiment, the amplified DNA from both restriction enzyme digested libraries and control libraries is digested with the same methylation-sensitive restriction endonuclease used in the first digestion, such as Not I. The digestion reaction comprises 50 ng to 5 μg of amplified DNA, 1× reaction buffer, and 1 to 25 Units of Not I restriction endonuclease. The mixture is incubated at 37° C. for 12 to 16 hours to ensure complete digestion. The enzyme is inactivated at 65° C. for 15 minutes, and the sample is purified and resuspended to a final concentration of 1 to 50 ng/μl in TE-Lo.

9. Sequence Attachment to the Ends of DNA Fragments

The following ligation procedure is designed to work with DNA that has been digested with restriction endonucleases resulting in ends with either 5′ overhangs, 3′ overhangs, or blunt ends. Under optimal conditions, the digestion procedure will result in the majority of products having ends competent for ligation. The adaptor is comprised of two oligos, 1 short and 1 long, that are hybridized to each other at some region along their length. In a specific embodiment, the long oligo is a 16-mer that will be ligated to the 5′ end of fragmented DNA. In another specific embodiment, the short oligo strand is a 3′ blocked 14-mer that contains a 4 bp 5′ overhang that is complementary to the 3′ overhang generated by the restriction endonuclease Not I. A skilled artisan recognizes that the length of the oligos that comprise the adaptor may be modified, in alternative embodiments. For example, a range of oligo length for the long oligo is about 15 bp to about 100 bp, and a range of oligo length for the short oligo is about 7 bp to about 20 bp. In addition, the structure of the adaptor is based on the type of end generated by the restriction endonuclease. A 3′ overhang on the long adaptor will be used for restriction endonucleases that result in a 5′ overhang and a blunt end adaptor will be utilized with enzymes that produce blunt end molecules. The preferred method will utilize restriction enzymes that result in either 5′ or 3′ overhangs. Furthermore, the structure of the adaptors has been developed to minimize ligation of adaptors to each other via at least one of three means: 1) absence of a 5′ phosphate group necessary for ligation; 2) presence of about a 7 bp 5′ overhang that prevents ligation in the opposite orientation; and/or 3) presence of a 3′ blocked base preventing fill-in of the 5′ overhang. The ligation of a specific adaptor is detailed in FIG. 42.

A typical ligation procedure involves the incubation of 1 to 100 ng of DNA in 1× T4 DNA ligase buffer, 10 pmol of each adaptor, and 400 Units of T4 DNA Ligase. The ligations are performed at 16° C. for 1 hour, followed by inactivation of the ligase at 75° C. for 15 minutes. The products of ligation can be stored at −20° C. to 4° C. until amplification.

10. Extension of the 3′ End of the DNA Fragment to Fill in the Universal Adaptor

Due to the absence of a phosphate group at the 5′ end of the adaptor, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. A 72° C. extension step is performed on the DNA fragments in the presence of DNA polymerase, 1× PCR Buffer, 200 μM of each dNTP, and 1 μM universal primer. This step may be performed immediately prior to amplification using Taq polymerase, or may be carried out using a thermo-labile polymerase, such as if the libraries are to be stored for future use. The ligation and extension steps are detailed in FIG. 42.

11. Site-Specific Amplification of Selected Molecules

A site-specific amplification reaction is performed in order to amplify only those molecules that contain the second universal primer. Only molecules that were cut during the second restriction digest will have had the second universal adaptor attached. Furthermore, it is predicted that the majority of these molecules will have the second universal adaptor at only 1 end. In order to amplify these fragments, the second universal primer is utilized in conjunction with a poly C primer to selectively amplify those molecules comprising either the second universal priming site at both ends or the first amplification priming site at one end and the second universal priming site at the other end. The poly C primer is unable to amplify molecules that contain the first universal priming site at both ends (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). In a specific embodiment, the amplification reaction comprises about 1 to 5 ng of template DNA, Taq polymerase, 1× polymerase buffer, 200 μM of each dNTP, and 1 μM each of universal primers K_(U) and C₁₀ (SEQ ID NO: 15 and SEQ ID NO: 38, respectively). In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out using a 2-step protocol of 94° C. 15 seconds, and 65° C. 2 minutes for the optimal number of cycles. Optimal cycle number is determined by analysis of DNA production using either real-time PCR or spectrophotometric analysis. Typically, about 5 to 15 μg of amplified DNA can be obtained from a 25 to 75 μl reaction using optimized conditions. The presence of the short oligo from the adaptor does not interfere with the amplification reaction due to its low melting temperature and the blocked 3′ end that prevents extension.
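The selection rule described above can be summarized in a short sketch. The end labels and the function name are hypothetical and introduced only for illustration; the rule encoded is the one stated in the text, namely that a molecule is expected to amplify with the second universal primer plus the poly C primer unless it carries the first universal priming site at both ends.

```python
# Hypothetical labels for the priming site carried at each end of a molecule.
FIRST = "first_universal"    # priming site from the primary library
SECOND = "second_universal"  # priming site added after the second digest

def is_amplified(left_end: str, right_end: str) -> bool:
    """Return True if the molecule is expected to amplify when the second
    universal primer is used together with the poly C primer."""
    for end in (left_end, right_end):
        if end not in (FIRST, SECOND):
            raise ValueError(f"unknown end label: {end}")
    # Per the text, molecules with the first universal priming site at both
    # ends are not amplified; every other combination is.
    return SECOND in (left_end, right_end)

# Enumerate the three possible end combinations.
for ends in [(FIRST, FIRST), (FIRST, SECOND), (SECOND, SECOND)]:
    print(ends, "->", is_amplified(*ends))
```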

12. One-Step Adaptor Attachment Method

In a specific embodiment, a one-step process utilizing a dU-Hairpin Adaptor method described in Examples 33, 38, and 39 can be used for attachment of the universal adaptor.

In a specific embodiment, attachment of such an adaptor comprises providing in a single reaction the following: a double stranded DNA molecule; an adaptor, which may be referred to as an oligonucleotide, comprising an inverted repeat with a non base-paired loop; DNA polymerase comprising 3′-5′ exonuclease activity; DNA polymerase comprising 5′-3′ polymerase activity (and these polymerase activities may be comprised on the same molecule or on different molecules); DNA ligase; dNTPs; and ATP, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the DNA molecule, thereby producing an adaptor-ligated DNA molecule comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-ligated DNA molecule. Such a method may be further defined as comprising the following actions: producing blunt ends of the DNA molecule; producing blunt ends of the adaptor; and ligating the blunt end of the adaptor to a blunt end of the DNA molecule, thereby generating a nick in the adaptor-ligated DNA molecule.

In a specific aspect of this embodiment, polymerization of the adaptor-ligated DNA molecule excluding at least part of the inverted repeat is further defined as subjecting the adaptor-ligated DNA molecule to nick translation.

The adaptor may further comprise a non-replicable base or region, wherein polymerization ceases at said non-replicable base or region, and the non-replicable base or region may be present in the loop of the adaptor. In specific embodiments, the non-replicable base or region comprises deoxy-uracil (dU) or hexaethylene glycol.

In some aspects of this embodiment, the polymerization of the adaptor-ligated DNA molecule generates an endonuclease site, such as a site-specific restriction endonuclease site, wherein at least part of the inverted repeat is removed by cleavage with said restriction endonuclease, for example. In specific embodiments, the restriction endonuclease is Eco NI or Bst UI. In another specific embodiment, the loop of the adaptor comprises about 3 dU nucleotides and the endonuclease is apurinic/apyrimidinic endonuclease (APE-endonuclease).

In further specific embodiments, the single reaction is further defined as occurring at one temperature. In other specific embodiments, the adaptor is removed by 5′ exonuclease. In specific embodiments, a 5′ end of the adaptor lacks a phosphate.

In further embodiments, the prepared molecule is subjected to amplification, which may comprise polymerase chain reaction, for example. The prepared molecule may be subjected to cloning.

In another specific aspect of this embodiment, attaching the adaptor comprises the step of providing in a single reaction the following: a double stranded DNA molecule; an adaptor comprising an inverted repeat and a loop, said loop comprising about 6-10 nucleotides; DNA polymerase comprising 3′-5′ exonuclease activity; DNA polymerase comprising 5′ endonuclease activity (activities that may or may not be on the same molecule); DNA ligase; dNTPs; and ATP, under conditions wherein the adaptor becomes blunt-end ligated to one strand of the DNA molecule, thereby producing an adaptor-ligated DNA molecule comprising a nick having a 3′ hydroxyl group, wherein there is polymerization from the 3′ hydroxyl group of at least part of the adaptor-ligated DNA molecule.

In a specific embodiment, the DNA molecule comprises two or more abasic sites, such as wherein the DNA molecule is subjected to apurinization with low pH and high temperature. In other specific embodiments, the method comprises subjecting the adaptor-ligated DNA molecule to polymerase chain reaction, wherein the polymerase chain reaction utilizes said adaptor as a primer.

13. Site-Specific Amplification of Selected Molecules

A site-specific amplification reaction is performed in order to amplify only those molecules that comprise the second universal primer. Only molecules that were cut during the second restriction digest will have had the second universal adaptor attached. Furthermore, it is predicted that the majority of these molecules will have the second universal adaptor at only 1 end. In order to amplify these fragments, the second universal primer is utilized in conjunction with a poly C primer to selectively amplify those molecules comprising either the second universal priming site at both ends or the first amplification priming site at one end and the second universal priming site at the other end. The poly C primer is unable to amplify molecules that comprise the first universal priming site at both ends (see, for example, U.S. patent application Ser. No. 10/293,048, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403; U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). In a specific embodiment, the amplification reaction comprises about 1 to 5 ng of template DNA, Taq polymerase, 1× polymerase buffer, 200 μM of each dNTP, and 1 μM each of universal primers K_(U) and C₁₀ (SEQ ID NO: 15 and SEQ ID NO: 38, respectively). In addition, fluorescein calibration dye (FCD) and SYBR Green I (SGI) may be added to the reaction to allow monitoring of the amplification using real-time PCR by methods well known in the art. PCR is carried out using a 2-step protocol of 94° C. 15 seconds, and 65° C. 2 minutes for the optimal number of cycles. Optimal cycle number is determined by analysis of DNA production using either real-time PCR or spectrophotometric analysis. Typically, about 5 to 15 μg of amplified DNA can be obtained from a 25 to 75 μl reaction using optimized conditions. The presence of the short oligo from the adaptor does not interfere with the amplification reaction due to its low melting temperature and the blocked 3′ end that prevents extension.

14. Analysis of Amplified Products to Determine Methylation Status

The amplified DNA products are analyzed using real-time, quantitative PCR using markers that are adjacent to Not I restriction sites. A panel of 14 typical and exemplary markers is listed in Table II. In a specific embodiment, 25 μl reactions were amplified for 40 cycles at 94° C. for 15 seconds and 65° C. for 1 minute. Standards corresponding to 10, 1, and 0.2 ng of fragmented DNA were used for each marker while samples are tested at multiple dilutions, typically 1:10 to 1:1000, to ensure that they amplify within the boundaries of the standard curve. Quantities are calculated by standard curve fit for each marker and are plotted as histograms. All markers should be successfully amplified in the control DNA. Markers that are present in the restriction enzyme digested sample are considered to be sites that were methylated in the original molecule.
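The standard curve calculation mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the examples: it assumes the usual real-time PCR model in which the threshold cycle (Ct) is linear in log10 of the input quantity, and the Ct values, function names, and `dilution_factor` parameter are hypothetical. Only the standard quantities (10, 1, and 0.2 ng) come from the text.

```python
import math

def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve and correct for the sample dilution."""
    return dilution_factor * 10 ** ((ct - intercept) / slope)

# Standards from the text; the Ct values are invented for illustration only.
standards_ng = [10.0, 1.0, 0.2]
standard_cts = [20.0, 23.3, 25.6]
slope, intercept = fit_standard_curve(standards_ng, standard_cts)

# A sample measured at a 1:10 dilution (hypothetical Ct of 24.0).
estimate_ng = quantity_from_ct(24.0, slope, intercept, dilution_factor=10)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={estimate_ng:.2f} ng")
```

A sample whose Ct falls outside the range spanned by the standards would be re-tested at another dilution rather than extrapolated, which is the reason the text gives for testing at multiple dilutions.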

TABLE II
HUMAN MARKERS USED FOR METHYLATION ANALYSIS BY QUANTITATIVE REAL-TIME PCR

#*   Accession #**   Forward Primer                           Reverse Primer
21   AJ322533        GAAACCCCTCAGCAACCTACC (SEQ ID NO: 69)    GCCCTTCATCCCGTATCACTT (SEQ ID NO: 70)
22   AJ322546        CATCAGGAATGTGGAAGTCGGTGCTGCGGTGACAGTGTGA (SEQ ID NO: 71) (SEQ ID NO: 72)
23   AJ322610        AGCCTGACGGAGAACATCTGG (SEQ ID NO: 73)    GCCTGAGGTCACTGAGGTTGG (SEQ ID NO: 74)
26   AJ322559        TGGCTCCTGAAATCAGACCTG (SEQ ID NO: 75)    GATTGTGTGGGTGTGAGTGGG (SEQ ID NO: 76)
27   AJ322568        CGTCCACACCCTCCAACCACCGCAGGAAACACAGACCAAAC (SEQ ID NO: 77) (SEQ ID NO: 78)
28   AJ322570        CTGGTCGCAGATTGGTGACAT (SEQ ID NO: 79)    GGCAAAAATGCAGCATCCTA (SEQ ID NO: 80)
29   AJ322572        CCTTGTCAGGATGGCACATTG (SEQ ID NO: 81)    CCGTCTCACACGCACCCTCT (SEQ ID NO: 82)
31   AJ322623        GCAATACGCTCGGCAATGACCGGGTAAGGAGGTGGGAACAC (SEQ ID NO: 83) (SEQ ID NO: 84)
35   AJ322781        GTCAACCCAGCCTGTGTCTGA (SEQ ID NO: 85)    GGATGGTCACCCTGTTGGAG (SEQ ID NO: 86)
36   AJ322715        GCTGAGGTTCGGCAAGTCTCC (SEQ ID NO: 87)    AGCCCCCAGTTCCTTTCAATC (SEQ ID NO: 88)
37   AJ322747        ACCAGGCACATGAGACAAGGAGGGCACCTGCTGTGACTTCT (SEQ ID NO: 89) (SEQ ID NO: 90)
38   AJ322801        CGAGAAATTCCCGAAACGAGA (SEQ ID NO: 91)    GCCCCTTGAGAATACCTTGCT (SEQ ID NO: 92)
44   AJ322670        GCAGAGCAAATTCGGGATTC (SEQ ID NO: 93)     CGGCTGAACTGATTCGGAAGT (SEQ ID NO: 94)
46   AJ322761        GCGTTCTCAACTGCGATTCCTGCCCTTCCTGTGAAAGCACT (SEQ ID NO: 95) (SEQ ID NO: 96)

*Omitted sequential numbers indicate dropped sequences that did not amplify well in quantitative RT-PCR.
**Accession number of marker sequences from GENBANK®. Sequences of the regions from which the primers were designed can be found in the nucleotide database at the National Center for Biotechnology Information's website.

TABLE III
METHYLATION-SENSITIVE RESTRICTION ENZYMES WHERE CLEAVAGE IS BLOCKED AT ALL METHYLATED SITES. POTENTIAL METHYLATION SITES (CG) ARE IN BOLD CAPITALS

Enzyme       Sequence     Enzyme      Sequence
Aat II       gaCGtc       Nae I       gcCGgc
Aci I        cCGc         Nar I       ggCGcc
Acl I        aaCGtt       NgoM IV     gcCGgc
Afe I        agCGct       Not I       gCGgcCGc
Age I        acCGgt       Nru I       tCGCGa
Asc I        ggCGCGcc     PaeR7 I     ctCGag
AsiS I       gCGatCGc     Pml I       caCGtg
Ava I        cyCGrg       Pvu I       CGatCG
BceA I       aCGgc        Rsr II      CGgwcCG
BmgB I       caCGtc       Sac II      cCGCGg
BsaA I       yaCGtr       Sal I       gtCGac
BsaH I       grCGyc       Sfo I       ggCGcc
BsiE I       CGryCG       SgrA I      crcCGgyg
BsiW I       CGtaCG       Sma I       ccCGgg
BsmB I       CGtctc       SnaB I      taCGta
BspD I       atCGat       Tli I       ctCGag
BspE I       tcCGga       Xho I       ctCGag
BsrB I       cCGctc
BsrF I       rcCGgy
BssH II      gCGCGc
BstB I       ttCGaa
BstU I       CGCG
Cla I        atCGat
Eag I        CGgcCG
Fau I        ccCGc
Fse I        ggcCGgcc
Fsp I        tgCGca
Hae II       rgCGcy
Hga I        gaCGc
Hha I        gCGc
HinP1 I      gCGc
Hpa II       cCGg
Hpy99 I      CGwCG
HpyCH4 IV    aCGt
Kas I        ggCGcc
Mlu I        aCGCGt

D. Methylation Analysis Method Using Methylome Libraries Constructed from DNA Digested with Frequently Cutting Methylation-Sensitive Restriction Enzymes or Libraries Subjected to Similar Digestion After Construction

In this embodiment, there are methods of preparing a library of DNA molecules to select for sequences that comprise recognition sites for methylation-sensitive restriction enzymes in regions of high GC content such as promoter CpG islands (FIGS. 33A, 33B, and 33C). After digestion of DNA with one (FIG. 33A) or a mixture of several, for example, 5 or more (FIGS. 33B and 33C) frequently cutting (4-5 base recognition site) methylation-sensitive restriction enzymes, a Methylome library is prepared by incorporating a universal sequence using primers comprising a universal sequence at their 5′-end and a degenerate non-self-complementary sequence at their 3′-end in the presence of DNA polymerase with strand-displacement activity (U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). The enzymes used for the DNA cleavage may include (but are not limited to) such commercially available restriction endonucleases as Aci I, Bst UI, Hha I, HinP1 I, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi EI, and Hga I, for example. The spatial distribution of recognition sites for these 11 nucleases in the human genome closely mimics the distribution of the CpG dinucleotides, with their density being especially high in many CpG-rich promoter regions (FIGS. 33D and 33E). As a result of cleavage, non-methylated CpG-rich regions such as gene promoters in normal cells are digested to very short fragments (FIG. 33B) while methylated CpG regions such as hypermethylated gene promoters in cancer cells remain intact (FIG. 33C). The Methylome DNA library may next be amplified in a PCR reaction with a primer comprising the universal sequence and a thermo-stable DNA polymerase. In the process of Methylome library synthesis and subsequent amplification, only those DNA molecules protected from cleavage by CpG methylation will amplify, whereas non-methylated DNA molecules are efficiently cleaved into small fragments that fail to be efficiently primed or converted into library amplicons. The digestion of non-methylated CpG regions results in a gap or loss of representation of these sequences in primary Methylome libraries (FIGS. 33A and 33B, and FIG. 51, Example 18). The presence of a specific DNA region encompassing a site or a group of sites in the final amplified Methylome library indicates that the CpG contained in the methylation-sensitive restriction site or a group of sites was methylated in the DNA template. The methylation status of any particular CpG site may be analyzed by any of a number of specific analytical methods known in the art, such as quantitative real-time PCR, LCR, ligation-mediated PCR, probe hybridization, probe amplification, microarray hybridization, a combination thereof, or other suitable methods in the art (FIG. 34 and FIG. 35, for example). Furthermore, use of control DNA that is not digested by the restriction enzyme during the initial library preparation (Whole Genome library) will allow validation of the selection of each site during library preparation and amplification.
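The effect of methylation on fragment size can be illustrated with a small in-silico digestion. The sketch below is a simplified model under stated assumptions: it uses only three palindromic recognition sequences from Table III (Hpa II, Hha I, and Bst UI), places each cut at the midpoint of the site rather than at the enzyme's true cleavage position, and treats a site as uncuttable when any CpG cytosine within it is methylated. The function names and the example sequence are hypothetical.

```python
# Recognition sequences from Table III. Only palindromic sites are used so
# that scanning one strand is sufficient.
SITES = {"Hpa II": "CCGG", "Hha I": "GCGC", "Bst UI": "CGCG"}

def blocked(seq, start, site, methylated):
    """True if any CpG cytosine inside the site is in the methylated set."""
    for i in range(start, start + len(site) - 1):
        if seq[i:i + 2] == "CG" and i in methylated:
            return True
    return False

def digest(seq, methylated=frozenset()):
    """Return fragment lengths after cutting at every unblocked site.

    `methylated` holds 0-based positions of methylated cytosines.
    """
    seq = seq.upper()
    cuts = set()
    for site in SITES.values():
        pos = seq.find(site)
        while pos != -1:
            if not blocked(seq, pos, site, methylated):
                cuts.add(pos + len(site) // 2)  # assumed cut at site midpoint
            pos = seq.find(site, pos + 1)       # allow overlapping sites
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

example = "AAAACCGGTTTTGCGCAAAA"
print(digest(example))                      # [6, 8, 6]: cut at both sites
print(digest(example, methylated={5, 13}))  # [20]: both CpGs methylated
```

In this model an unmethylated region is reduced to short fragments while the same region with methylated CpGs remains a single molecule, which mirrors the contrast drawn above between FIG. 33B and FIG. 33C.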

In one specific embodiment (such as is described in Example 28), there is a method for improving the restriction enzyme cleavage efficiency by pre-heating genomic DNA at 85° C., and specifically as it pertains to cleavage by the restriction enzyme Aci I within the GC-rich promoter regions. GC-rich DNA sequences, through interactions with proteins, may form alternative (non-Watson-Crick) DNA conformation(s) that are stable even after protein removal and DNA purification. These putative DNA structures could be resistant to restriction endonuclease cleavage and affect the performance of the methylation assay. Heating DNA at an elevated temperature (but not so high as to melt the DNA) reduces the energetic barrier and accelerates the transition of DNA from a non-canonical form to a classical Watson-Crick structure.

In a second specific embodiment, there are methods for preparing a secondary library of DNA molecules from the amplification products of the primary Methylome library in such a way as to enrich for only those sequences that are between methylated restriction endonuclease sites present in the primary library. An outline of this method is detailed in Example 22 and is depicted in FIGS. 43A and 43B. Following amplification of the primary methylation library, all of the previously methylated restriction endonuclease recognition sites are converted to unmethylated sites. Digestion of these molecules with the same restriction endonuclease utilized in construction of the primary library will result in cleavage of these sites (FIG. 43A). Following cleavage, a mixture of 2 or more secondary adaptors may be ligated to the resulting newly cleaved ends. Two or more secondary adaptors are utilized to allow amplification of small molecules that would not be amplified, due to PCR suppression, if the same adaptor sequence was present at both ends. Amplification with primers complementary to these secondary adaptors will only amplify those molecules that contain the secondary adaptors at both ends. These amplimers will correspond to sequences between recognition sites that were originally methylated in the starting material (secondary Methylome library). In a preferred embodiment (Example 29), there is a demonstration of the utility of the Methylome library prepared from DNA digested with a mixture of several methylation-sensitive restriction enzymes for analysis of the methylation status of promoter regions for 24 exemplary genes in leukemia cell line DNA. The invention employs the use of five exemplary methylation-sensitive restriction enzymes, specifically, Aci I, Bst UI, Hha I, HinP1 I, and Hpa II, to convert intact non-methylated CpG-rich promoter regions into restriction fragments that fall below the minimum length competent for amplification by degenerate primary Methylome library. Secondary Methylome libraries are subsequently prepared using a mixture of several (5 or more) methylation-sensitive restriction enzymes, the secondary library can be prepared by mixing together the products of several individual restriction digests of the primary Methylome library (using individually the same restriction endonucleases that have been utilized in the primary library nuclease cocktail), ligating secondary adaptors, and amplifying with universal primer whole genome amplification (WGA) method (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791), while methylated CpG-rich promoter regions resistant to digestion are efficiently amplified specific to the secondary adaptors (FIG. 43B). Analysis of these amplicons can be carried out by PCR, microarray hybridization, probe assay, capillary electrophoresis, sequencing, or other methods known in the art (Example 18, FIG. 34 and FIG. 35), for example. Sequencing of these products can provide a tool for discovering regions of methylation not previously characterized, as no a priori knowledge of the sequences is required and the reduced complexity of the enriched secondary library allows analysis of a small number of methylated regions.

The importance of implementation of multiple methylation-sensitive restriction enzymes in methylome library preparation stems from the analysis of promoter regions in the human genome. The spatial distribution of methylation-sensitive restriction sites for restriction endonucleases with 4 and 5 base recognition sites, such as Aci I, Bst UI, Hha I, HinP1 I, Hpa II, Hpy 99I, Hpy CH4 IV, Ava I, Bce AI, Bsa HI, Bsi E1, and Hga I, closely mimics the distribution of the CpG dinucleotides in these regions. When DNA is incubated with a single methylation-sensitive enzyme, the resulting digestion is incomplete, with many restriction sites remaining uncut. Factors contributing to this phenomenon are likely the extremely high GC-content and the potential for alternative secondary structure. As a result, DNA pre-treated with one restriction enzyme may still contain substantial amounts of uncut non-methylated sites. Co-digestion of DNA with a cocktail of 5 or more methylation-sensitive restriction enzymes results in efficient conversion of all non-methylated CpG islands into very small DNA fragments while leaving completely methylated CpG regions intact. Subsequently, whole genome amplification (WGA) of DNA pre-treated with the restriction enzyme cocktail using the universal K_(U) primer (SEQ ID NO: 15) results in amplification of all DNA regions except the CpG- and restriction site-rich regions that were not methylated in the original DNA. These regions are digested into fragments that fail to amplify using the random-primed WGA method. Multiple-enzyme-mediated depletion of non-methylated promoter regions in the amplified methylome library is so efficient that non-methylated CpG-rich regions cannot be detected by PCR. Those regions encompassing densely methylated CpG islands are not affected by the enzyme cocktail treatment, are efficiently amplified by the WGA process, and can later be easily detected and quantified by real-time PCR.

The presence of methylated DNA within 24 cancer gene promoters was analyzed by quantitative real-time PCR using amplified libraries and a panel of 40 specific primer pairs. Primers were designed to test the libraries for amplicons spanning CpG-rich regions within promoters. The presence or absence of amplification for specific sequences that display a high frequency of potential cleavage sites was indicative of the methylation status of the promoter. Initially, a set of 24 promoters frequently implicated in different types of cancer was evaluated. The exemplary primer pairs used in the PCR assays are listed in Table IV.

In a third specific embodiment (such as is described in Example 40), there is an analysis of the sensitivity of the methylation assay that involves preparation of Methylome libraries by cleavage with multiple (five) methylation-sensitive restriction enzymes. The analysis uses libraries prepared by incorporation of a universal sequence and amplification with the self-inert K_(U) primer (SEQ ID NO: 15) from DNA of the prostate cancer cell line LNCaP mixed with normal non-methylated DNA in different ratios. FIG. 65 shows the threshold cycle (Ct) difference between cut and uncut methylome libraries from real-time PCR for three promoter primer pairs with various percentages of prostate cell line (LNCaP) DNA in the libraries. Both the APC1-3 and GSTP1-1 gene promoter region primers demonstrated the presence of target promoter DNA, and thus protection from cutting by methylation-sensitive restriction enzymes, with as little as 1% or less of cancer cell line DNA present, suggesting that methylated DNA can be detected in a background of at least 99% non-methylated DNA.
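As a rough illustration of the cut-versus-uncut Ct comparison described above, the following sketch estimates the Ct difference that would be expected if only the methylated fraction of a promoter survived digestion. It assumes ideal amplification (a doubling per cycle by default) and ignores any bias introduced during library amplification; the function name and the `efficiency` parameter are illustrative assumptions and are not taken from the Examples.

```python
import math

def expected_delta_ct(methylated_fraction, efficiency=1.0):
    """Idealized Ct(cut) - Ct(uncut) when only `methylated_fraction` of the
    target copies is protected from digestion.

    efficiency=1.0 assumes perfect doubling per PCR cycle (an assumption).
    """
    if not 0 < methylated_fraction <= 1:
        raise ValueError("methylated_fraction must be in (0, 1]")
    return -math.log(methylated_fraction) / math.log(1.0 + efficiency)

# Hypothetical dilution series of methylated (cancer) DNA in normal DNA
for fraction in (1.0, 0.5, 0.1, 0.01):
    print(f"{fraction:>5.0%} methylated: delta Ct ~ {expected_delta_ct(fraction):.2f}")
```

Under these assumptions, a 1% admixture corresponds to a delay of roughly 6 to 7 cycles for the cut library, which remains within the range that real-time PCR can resolve.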

In another embodiment, there are methods for preparing a secondary library of DNA molecules from the amplification products of the primary Methylome library in such a way as to enrich for only those sequences that are between methylated restriction endonuclease sites present in the primary library. An outline of this method is detailed in Example 22 and is depicted in FIGS. 43A and 43B. Following amplification of the primary methylation library, all of the previously methylated restriction endonuclease recognition sites are converted to unmethylated sites. Digestion of these molecules with the same restriction endonuclease utilized in construction of the primary library will result in cleavage of these sites (FIG. 43A). Following cleavage, a mixture of 2 or more secondary adaptors is ligated to the resulting cleaved ends. Two or more secondary adaptors are utilized to allow amplification of small molecules that would not be amplified, due to suppression, if the same adaptor sequence were present at both ends. Amplification with primers complementary to these secondary adaptors will only amplify those molecules that contain the secondary adaptors at both ends. These amplimers will correspond to sequences between recognition sites that were originally methylated in the starting material (secondary Methylome library).

In a preferable situation, a one-step library preparation process utilizing a dU-Hairpin Adaptor method described in Examples 33, 38, and 39 can be used for preparation of secondary Methylome libraries. In this case, two hairpin oligonucleotides with different sequences should be used to avoid the PCR suppression effect that is known to inhibit amplification of very short DNA amplicons with one universal sequence at both ends.

In a preferable case, when the primary Methylome library is prepared by using a mixture of several (5 or more) methylation-sensitive restriction enzymes, the secondary library can be prepared by mixing together the products of several individual restriction digests of the primary Methylome library (each digest using one of the restriction endonucleases that were utilized in the nuclease cocktail), ligating adaptors, and amplifying (FIG. 43B). Analysis of these amplicons can be carried out by PCR, microarray hybridization, probe assay, capillary electrophoresis, sequencing, or other methods known in the art (Example 18, FIG. 34 and FIG. 35), for example. Sequencing of these products can provide a tool for discovering regions of methylation not previously characterized, as no a priori knowledge of the sequences is required and the reduced complexity of the enriched secondary library allows analysis of a small number of methylated regions.

In one specific embodiment (Example 30), there is a preparation and labeling of a secondary Methylome library for microarray analysis and a demonstration of its 16- to 128-fold enrichment in copy number for several methylated CpG promoters compared to the primary Methylome library. Libraries were prepared from the prostate cancer cell line LNCaP (Coriell Institute for Medical Research) and from DNA isolated from peripheral blood of a healthy donor. The Methylome library was prepared by using five methylation-sensitive restriction enzymes, specifically, Aci I, Bst UI, Hha I, HinP1 I, and Hpa II, for example, and the degenerate primer whole genome amplification (WGA) method (see, for example, U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791), and amplified using the universal K_(U) primer (SEQ ID NO: 15).

The distribution of promoter sites and the level of their enrichment in amplified secondary methylome libraries from cancer DNA were analyzed by quantitative PCR using primer pairs amplifying short amplicons that do not contain recognition sites for at least two of the methylation-sensitive restriction enzymes employed in the present example (Table V, SEQ ID NOs: 190 through 197). Mechanically fragmented genomic DNA from the peripheral blood of a healthy donor was used as a control for relative copy number evaluation.

FIG. 66 shows typical amplification curves of four promoter sites, three of which (GSTP-1, RASSF-1, and CD44) are methylated, and one of which (p16) is not methylated in LNCaP cell line DNA. For methylated promoters, a leftward shift of between 4 and 7 cycles (enrichment of between 16- and 128-fold) of the amplification curves is observed for the secondary methylome library relative to the curve corresponding to control non-amplified genomic DNA. For the non-methylated p16 promoter, a curve delayed approximately 4 cycles relative to the control appeared. However, this curve did not correspond to the correct size amplicon and was most likely a product of mis-priming.
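The relationship between the cycle shift and the fold enrichment quoted above can be sketched as follows. The sketch assumes ideal PCR in which the product doubles every cycle; the `efficiency` parameter is an illustrative assumption for non-ideal reactions and does not appear in the Examples.

```python
def fold_enrichment(cycle_shift, efficiency=1.0):
    """Fold change in starting copy number implied by a leftward shift of
    `cycle_shift` cycles, assuming (1 + efficiency)-fold growth per cycle."""
    return (1.0 + efficiency) ** cycle_shift

for shift in (4, 7):
    # With perfect doubling, 4 cycles gives 16-fold and 7 cycles gives 128-fold
    print(f"{shift}-cycle shift: {fold_enrichment(shift):.0f}-fold")
```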

A specific method for the analysis of this reduced complexity secondary methylation library is presented in Example 23 and FIG. 44. Briefly, the number of molecules present in the secondary library is a function of the number of methylated CpG islands in the genome, and the average number of methylation-specific restriction endonuclease sites within each island. For example, if 1% of the approximately 30,000 CpG islands are hypermethylated (300 islands), and there are 5 Hpa II restriction sites per CpG island (releasing 4 fragments per island), then there would be approximately 1,200 amplified fragments present in the secondary library. Amplification of this library with a mixture of 4 A primers and 4 B primers, each containing a 3′ selector nucleotide, would result in 16 possible combinations of primers for amplification. Thus, the 1,200 amplified fragments would be divided between 16 reactions, resulting in approximately 75 fragments per reaction. Capillary electrophoresis of each reaction would allow for the resolution of these 75 products, and the patterns of methylated CpG islands could be resolved. Additional sequencing reactions could be performed to identify the specific bands of interest from within each mixture.
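The arithmetic in the preceding estimate can be reproduced with a short sketch. It assumes that n restriction sites in a fully methylated island release n − 1 internal fragments and that fragments distribute evenly over the primer combinations; the function and parameter names are illustrative only.

```python
from itertools import product

def secondary_library_estimate(cpg_islands=30_000, fraction_methylated=0.01,
                               sites_per_island=5, selector_bases="ACGT"):
    """Return (total fragments, primer combinations, fragments per reaction)."""
    methylated_islands = round(cpg_islands * fraction_methylated)
    # n sites inside one island delimit n - 1 internal fragments
    fragments_per_island = sites_per_island - 1
    total_fragments = methylated_islands * fragments_per_island
    # one A primer and one B primer, each ending in one of four selector bases
    primer_pairs = list(product(selector_bases, repeat=2))
    return total_fragments, len(primer_pairs), total_fragments / len(primer_pairs)

total, reactions, per_reaction = secondary_library_estimate()
print(total, reactions, per_reaction)
```

In practice the fragments would not be distributed evenly, so the per-reaction figure is only an average.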

1. Choice of Restriction Enzymes

Methylation-sensitive restriction enzymes with recognition sites comprising the CpG dinucleotide and no adenine or thymine are expected to cut genomic DNA with much lower frequency as compared to their counterparts having recognition sites with normal GC to AT ratios. There are two reasons for this. First, due to the high rate of methyl-cytosine to thymine transition mutations, the CpG dinucleotide is severely under-represented and unequally distributed across the human genome. Large stretches of DNA are depleted of CpGs and thus do not contain these restriction sites. Second, most methylated cytosine residues are found in CpG dinucleotides that are located outside of CpG islands, primarily in repetitive sequences. Due to methylation, these sequences will also be protected from cleavage. On the other hand, about 50 to 60% of the known genes contain CpG islands in their promoter regions, and they are maintained largely unmethylated, except in the cases of normal developmental gene expression control, gene imprinting, X chromosome silencing, or aberrant methylation in cancer and some other pathological conditions. These CpG islands are digested by the methylation-sensitive restriction enzymes in normal gene promoter sites but not in aberrantly methylated promoters. Four-base GC recognition restriction enzymes, as exemplified by Aci I, BstU I, Hha I, HinP1 I, and Hpa II with recognition sites CCGC, CGCG, GCGC, GCGC, and CCGG, respectively (Table III), are particularly useful since they will frequently cut non-methylated DNA in CpG islands, but not methylated DNA, and, as exemplified herein, can be used as a 5-enzyme mix using optimized buffer conditions. Restriction endonucleases Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi E1, and Hga I with 5-base recognition sites can also be used under these buffer conditions, thus extending the potential number of restriction enzymes in the reaction mix (up to 11) and increasing the effective depletion of non-methylated CpG-rich DNA template. The spatial distribution of recognition sites for these nucleases in the human genome closely follows the distribution of the CpG dinucleotides (FIGS. 33D and 33E), with particularly high density in CpG-rich gene promoter regions (CpG islands). A current list of known methylation-sensitive restriction endonucleases is presented in Table III. As yet undiscovered but potentially useful enzymes for Methylome library construction would be methylation-sensitive restriction nucleases having 4-base recognition sites with two CpG dinucleotides that are separated by one, two, three, or more random bases, such as CGNCG, CGNNCG, CGNNNCG, with a general formula CG(N)_(m)CG.
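The density of recognition sites in a sequence of interest can be examined in silico. The following is a minimal sketch, assuming the four-base recognition sequences listed above; it locates recognition sites on either strand but does not model the exact cut position within each site, methylation status, or buffer effects. The helper names and the test sequence are hypothetical.

```python
import re

# Recognition sequences of the five exemplary enzymes (5'->3')
SITES = {"Aci I": "CCGC", "BstU I": "CGCG", "Hha I": "GCGC",
         "HinP1 I": "GCGC", "Hpa II": "CCGG"}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def site_positions(seq, site):
    """0-based start positions of `site` on either strand (overlaps allowed)."""
    seq = seq.upper()
    hits = set()
    # A non-palindromic site such as CCGC also occurs as GCGG on the top strand
    for pattern in {site, reverse_complement(site)}:
        hits.update(m.start() for m in re.finditer(f"(?={pattern})", seq))
    return sorted(hits)

def cg_n_cg_positions(seq, m):
    """Start positions of the hypothetical CG(N)mCG motif for a given m."""
    return [x.start() for x in re.finditer(f"(?=CG[ACGT]{{{m}}}CG)", seq.upper())]

example = "ATCCGCGGTA"  # hypothetical test sequence, not from the sequence listing
for enzyme, site in SITES.items():
    print(enzyme, site_positions(example, site))
```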

2. Restriction Digestion of Target DNA

In a specific embodiment, target DNA is digested with a mix of methylation-sensitive restriction endonucleases, such as Aci I, BstU I, Hha I, HinP1 I, and Hpa II, or a compatible combination thereof. The digestion reaction usually comprises from 10 ng to 1 μg of genomic DNA in 25-100 μl of 1× NEBuffer (NEB), and about 1 to about 25 units of each restriction endonuclease. The mixture is incubated at 37° C. for 2-18 h followed by 2 h at 60° C. to ensure complete digestion. When appropriate, the enzymes are inactivated at 65° C. to 70° C. for 15 minutes and the sample is precipitated and resuspended to a final concentration of 1 to 50 ng/μl. In a preferred embodiment, digested DNA is directly used for library preparation. Genomic DNA that has not been digested by the methylation-sensitive enzyme mix may serve as a positive control during library preparation and analysis, for example.

3. Library Preparation and Amplification

The described invention utilizes an oligonucleotide primer comprising, for at least the majority of its sequence, only two types of nucleotide bases that cannot participate in stable Watson-Crick pairing with each other, and thus cannot self-prime (see, for example, U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). The primers comprise a constant known sequence at their 5′ end and a degenerate nucleotide sequence located 3′ to the constant known sequence. There are four possible two-base combinations known not to participate in Watson-Crick base pairing: C-T, G-A, A-C and G-T. They suggest four different types of degenerate primers that should not form a single Watson-Crick base pair or create primer-dimers in the presence of DNA polymerase and dNTPs. These primers are illustrated in FIG. 2 and are referred to as primers Y, R, M and K, respectively, in accordance with common nomenclature for degenerate nucleotides: Y = C or T, R = G or A, M = A or C and K = G or T.

For example, Y-primers have a 5′ known sequence Y_(U) comprised of C and T bases and a degenerate region (Y)₁₀ at the 3′ end comprising ten, for example, randomly selected pyrimidine bases C and T. R-primers have a 5′ known sequence R_(U) comprised of G and A bases and a degenerate region (R)₁₀ at the 3′ end comprising ten, for example, randomly selected purine bases G and A. M-primers have a 5′ known sequence M_(U) comprised of A and C bases and a degenerate region (M)₁₀ at the 3′ end comprising ten, for example, randomly selected bases A and C. Finally, K-primers have a 5′ known sequence K_(U) comprised of G and T bases and a degenerate region (K)₁₀ at the 3′ end comprising ten, for example, randomly selected bases G and T. Primers of the described design will not self-prime and thus will not form primer dimers. However, they will prime at target sites comprising the corresponding Watson-Crick base partners, albeit with reduced overall frequency compared to completely random primers. In specific embodiments, these primers under specific conditions are capable of forming primer dimers, but at a greatly reduced level compared to primers lacking such structure.
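The self-inert property of the Y, R, M and K primer designs can be illustrated with a short sketch. It builds a primer from a two-base alphabet and checks that no base in the primer has its Watson-Crick partner elsewhere in the primer. The constant sequence used here is a hypothetical placeholder and is not SEQ ID NO: 15, and the check considers only Watson-Crick pairs (it does not model G-T wobble or other non-canonical interactions).

```python
import random

NON_PAIRING = {"Y": "CT", "R": "GA", "M": "AC", "K": "GT"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def can_self_pair(primer):
    """True if any base in the primer has its Watson-Crick partner in the primer."""
    bases = set(primer.upper())
    return any(COMPLEMENT[b] in bases for b in bases)

def make_primer(kind, constant, degenerate_len=10, random_tail=0, rng=None):
    """Constant 5' region + degenerate region + optional fully random 3' tail."""
    rng = rng or random.Random()
    alphabet = NON_PAIRING[kind]
    if set(constant.upper()) - set(alphabet):
        raise ValueError(f"constant region must use only {alphabet}")
    degenerate = "".join(rng.choice(alphabet) for _ in range(degenerate_len))
    tail = "".join(rng.choice("ACGT") for _ in range(random_tail))
    return constant.upper() + degenerate + tail

# Hypothetical K-type constant region (placeholder, not the patent's sequence)
primer = make_primer("K", "GTGTTGGT", rng=random.Random(0))
print(primer, can_self_pair(primer))
```

Setting `random_tail` to a small value such as 2 mimics primers supplemented with a few completely random 3′ bases; such a tail can reintroduce a limited ability to self-pair.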

In some embodiments, these primers are supplemented with a completely random (i.e., containing any of the four bases) short nucleotide sequence at their 3′ end. A limited number of completely random bases present at the 3′ end of the Y, R, M or K primers increases their priming frequency, yet maintains limited ability for self-priming. By using a different number of completely random bases at the 3′ end of the degenerate Y, R, M or K primers, and by carefully optimizing the reaction conditions, one can precisely control the outcome of the polymerization reaction in favor of the desired DNA product with minimum primer-dimer formation.

Thus, in the first step, referred to as a “library synthesis” step, primers of the described design are randomly incorporated in an extension/polymerization reaction with a DNA polymerase possessing strand-displacement activity. The resulting branching process creates DNA molecules having known (universal) self-complementary sequences at both ends. In a second step, referred to as the “amplification” step, these molecules are amplified exponentially by polymerase chain reaction using Taq DNA polymerase and a single primer corresponding to the known 5′-tail of the random primers. This process overcomes major problems known in the art for DNA amplification by random primers.

Random fragmentation of DNA can be performed by mechanical, chemical, or enzymatic treatment. In a preferred embodiment, DNA is fragmented by heating at about 95° C. in low salt buffers such as TE (10 mM Tris-HCl, 1 mM EDTA, having pH between 7.5 and 8.5) or TE-L (10 mM Tris-HCl, 0.1 mM EDTA, having pH between 7.5 and 8.5) for between about 1 and about 10 minutes (for example, see U.S. Patent Application 20030143599, incorporated by reference herein in its entirety).

In a preferred embodiment of the present invention, a library synthesis reaction is performed in a volume of about 10 to about 25 μl. The reaction mixture comprises about 100 ng or less of restriction digested and thermally-fragmented DNA, about 1 μM of self-inert degenerate K(N)₂ primer containing G and T bases at the known and degenerate regions and 2 completely random 3′ bases (SEQ ID NO: 14), about 4% (v/v) of dimethylsulfoxide (DMSO), about 200 μM 7-deaza-dGTP (Sigma), between about 2 units and about 10 units of Klenow Exo- DNA polymerase (NEB), between about 5 mM and about 10 mM MgCl₂, about 100 mM NaCl, about 10 mM Tris-HCl buffer having pH of about 7.5, and about 7.5 mM dithiothreitol. Preferably, the incubation time of the reaction is between about 60 minutes and about 120 minutes, and the incubation is carried out at about 24° C. in an isothermal mode or, in another preferred embodiment, by sequential isothermal steps at between about 16° C. and about 37° C.

A typical amplification step with universal sequence primer K_(U) (SEQ ID NO: 15) comprises between about 1 and about 25 ng of library products and between about 0.3 and about 2 μM of universal sequence primer, about 4% DMSO, about 200 μM 7-deaza-dGTP (Sigma), and about 0.5 M betaine (Sigma) in a standard PCR reaction well known in the art, under conditions optimal for a thermostable DNA polymerase, such as Taq DNA polymerase, Pfu polymerase, or derivatives and mixtures thereof.

4. Analysis of Amplified Products to Determine the Methylation Status of Target DNA

Aliquots of the amplified library DNA are analyzed for the presence of CpG sites or regions encompassing more than one such site. This can be achieved by quantitative real-time PCR amplification, comparative hybridization, ligation-mediated PCR, ligation chain reaction (LCR), fluorescent or radioactive probe hybridization, hybridization to promoter microarrays comprising oligonucleotides or PCR fragments, or by probing microarray libraries derived from multiple samples with labeled PCR or oligonucleotide probes, for example. The magnitude of the signal will be proportional to the level of methylation of a promoter site.

A typical quantitative real-time PCR-based methylation analysis reaction comprises 1× Taq polymerase reaction buffer, about 10 to about 50 ng of library DNA, about 200 to about 400 nM of each specific primer, about 4% DMSO, 0 to about 0.5 M betaine (Sigma), 1:100,000 dilutions of fluorescein calibration dye (FCD) and SYBR Green I (SGI) (Molecular Probes), and about 5 units of Taq polymerase. PCR is carried out on an I-Cycler real-time PCR system (Bio-Rad) using a cycling protocol optimized for the respective primer pair and for the size and the base composition of the analyzed amplicon.

Preparation of Secondary Methylome Libraries and their Utility for Discovery of New Methylation Markers

In a specific embodiment, the preparation of what may be termed a “Secondary Methylome” library derived from the amplified primary Methylome library is described. Secondary libraries are derived by cleavage of the primary library with the same set of methylation-sensitive restriction endonucleases used in preparation of the primary library and subsequent amplification of the excised short DNA fragments. Restriction sites originally methylated in the DNA sample were refractory to cleavage in the primary library; however, amplification, which substitutes the 5-methylcytosines of the starting template DNA with non-methylated cytosines, conveys cleavage sensitivity to these previously protected restriction sites. Incubation of the amplified primary library with the restriction endonuclease set (Aci I, Hha I, HinP1 I, or Hpa II) would have no effect on amplicons lacking those restriction sites, produce a single break in amplicons with one site, and release one or more restriction fragments from CpG-rich amplicons with two or more corresponding restriction sites. Selective ligation of adaptors (containing 5′-CG-overhangs complementary to the ends of Aci I, Hha I, HinP1 I, and Hpa II restriction fragments, or blunt-end adaptors compatible with the ends of fragments produced by Bst UI) and subsequent amplification of the ligation products by PCR results in amplification of only those DNA fragments that were originally flanked by two methylated restriction sites. Secondary Methylome libraries generated by different restriction enzymes can be mixed together to produce a redundant secondary Methylome library containing overlapping DNA restriction fragments originating from the methylated CpG islands present in the sample. These libraries are highly enriched for methylated sequences and can be analyzed by hybridization to a promoter microarray or by real-time PCR using very short PCR amplicons.
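The selection principle described above can be sketched with a simplified model. The model assumes that each recognition site is represented by a single coordinate and a methylation flag, that unmethylated sites were cleaved during primary library construction, and that both flanking sites lie within one primary-library amplicon; only intervals bounded by two adjacent, originally methylated sites are then reported as secondary-library fragments. The function name and coordinates are hypothetical.

```python
def secondary_fragments(sites):
    """sites: iterable of (position, was_methylated) for one enzyme.

    Returns (start, end) intervals bounded by two adjacent sites that were
    both methylated in the starting DNA.
    """
    ordered = sorted(sites)
    fragments = []
    for (pos1, meth1), (pos2, meth2) in zip(ordered, ordered[1:]):
        # Both ends must come from cleavage of the amplified (now unmethylated)
        # library, so both flanking sites must have been protected originally
        if meth1 and meth2:
            fragments.append((pos1, pos2))
    return fragments

# Hypothetical site map along one promoter region
site_map = [(100, False), (250, True), (310, True), (400, True),
            (900, False), (950, True)]
print(secondary_fragments(site_map))
```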

5. Restriction Digestion of Amplification Products from the Primary Methylome Library

In specific embodiments, amplified library DNA is digested with the same methylation-sensitive restriction endonuclease(s) utilized to generate a primary Methylome library, such as Aci I, BstU I, Hha I, HinP1 I, and Hpa II or a combination thereof. The digestion reaction contains about 0.1 ng to about 10 μg of amplified library DNA, 1× reaction buffer, and about 1 to about 25 units of restriction endonuclease(s). The mixture(s) is incubated at 37° C. or at the optimal temperature of the respective endonuclease for about 1 hour to about 16 hours to ensure complete digestion. The enzyme(s) is inactivated at 65° C. to 70° C. for 15 minutes and the sample is precipitated with ethanol and resuspended to a final concentration of 1 ng/μl to 50 ng/μl.

In a specific embodiment described in Example 30, primary methylome libraries prepared from DNA isolated from the prostate cancer cell line LNCaP or from control peripheral blood DNA of a healthy donor are pre-heated at 80° C. and digested in three separate tubes with the methylation-sensitive enzymes Aci I, Hpa II, and a mixture of Hha I and HinP1 I. Digestion products are pooled, size-fractionated by ultrafiltration to select for short products of the secondary cleavage, and concentrated by ethanol precipitation.

6. Attachment of Secondary Adaptors

The following ligation procedure is designed to work with DNA that has been digested with restriction endonucleases resulting in ends with either 5′ overhangs, 3′ overhangs, or blunt ends. Under optimal conditions, the digestion procedure will result in the majority of products having ends competent for ligation. The adaptor is composed of two oligonucleotides, 1 short and 1 long, which are hybridized to each other at some region along their length. A range of length for the short oligonucleotide is about 7 bp to about 20 bp. In addition, the structure of the adaptor is based on the type of ends generated by the restriction endonuclease. A 3′ overhang on the long adaptor will be used for restriction endonucleases that result in a 5′ overhang, and a blunt end adaptor will be utilized with enzymes that produce blunt end molecules. The structure of the adaptors has been developed to minimize ligation of adaptors to each other via at least one of three means: 1) lack of a 5′ phosphate group necessary for ligation; 2) presence of about a 7 bp 5′ overhang that prevents ligation in the opposite orientation; and/or 3) a 3′ blocked base preventing fill-in of the 5′ overhang.

A typical ligation procedure involves the incubation of about 1 to about 100 ng of DNA in 1× T4 DNA ligase buffer, about 10 to about 100 pmol of each adaptor, and about 400 to about 2,000 units of T4 DNA ligase. Ligations are performed at 16° C. to 37° C. for 1 hour, followed by inactivation of the ligase at 75° C. for 15 minutes. The products of ligation can be stored at −20° C. to 4° C. until amplification.
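To relate the adaptor amounts above to the number of DNA ends available for ligation, the following sketch converts a DNA mass into picomoles of ends. It assumes an average mass of about 660 g/mol per base pair of double-stranded DNA and a user-supplied mean fragment length; both the constant and the example numbers are illustrative assumptions and are not taken from the Examples.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of dsDNA (assumption)

def pmol_ends(mass_ng, mean_length_bp):
    """Picomoles of fragment ends in `mass_ng` of dsDNA of the given mean length."""
    pmol_molecules = mass_ng * 1000.0 / (mean_length_bp * AVG_BP_MASS)
    return 2.0 * pmol_molecules  # each linear fragment has two ends

def adaptor_excess(adaptor_pmol, mass_ng, mean_length_bp):
    """Molar ratio of one adaptor to fragment ends."""
    return adaptor_pmol / pmol_ends(mass_ng, mean_length_bp)

# Hypothetical case: 66 ng of 100 bp fragments and 10 pmol of adaptor
print(pmol_ends(66, 100), adaptor_excess(10, 66, 100))
```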

In a specific embodiment described in Example 30, conversion of short restriction fragments, the products of secondary restriction cleavage, to amplifiable libraries is achieved by ligation of Y1 and Y2 universal adaptors (Table V) comprising unique sequences containing only C and T (non-Watson-Crick pairing bases) on one strand and having a 5′-CG overhang on the opposite (A and G) strand that is complementary to the GC overhangs of the restriction fragments produced by digestion with methylation-sensitive restriction enzymes. Digested and filtered library DNA is incubated with Y1 and Y2 adaptors, each present at 0.6 μM, and 1,200 units of T4 DNA ligase in 45 μl of 1× T4 DNA ligase buffer (NEB) for 50 min at 16° C. followed by 10 min at 25° C.

7. Extension of the 3′ End of the DNA Fragment to Fill in the Secondary Adaptors

Due to the lack of a phosphate group at the 5′ end of the adaptor, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. A 72° C. extension step is performed on the DNA fragments in the presence of 1× DNA polymerase, 1× PCR buffer, 200 μM of each dNTP, and 1 μM universal primer. This step may be performed immediately prior to amplification using Taq polymerase, or may be carried out using a thermo-labile polymerase, for example if the libraries are to be stored for future use.

8. One-Step Attachment of Secondary Adaptors

In a preferred embodiment, a one-step process utilizing a dU-Hairpin Adaptor method described in Examples 33, 38, and 39 is used for preparation of secondary Methylome libraries. In this case, a mixture of hairpin oligonucleotides comprising two different known sequences should be used to avoid the PCR suppression effect, which is known to inhibit amplification of very short DNA amplicons when an identical sequence is attached to both ends.

9. Amplification of the Secondary Methylation Library

The amplification of secondary methylation libraries involves use of universal primers complementary to the secondary adaptors. Two or more secondary adaptors are utilized to allow amplification of small molecules that would otherwise fail to amplify, because of PCR suppression, if a single adaptor sequence were used. A typical amplification step comprises between about 1 and about 25 ng of library products and between about 0.3 and about 1 μM of each secondary adaptor sequence primer in a standard PCR reaction well known in the art, under conditions optimal for a thermostable DNA polymerase, such as Taq DNA polymerase, Pfu polymerase, or derivatives and mixtures thereof.

In a specific embodiment described in Example 30, libraries are prepared for microarray analysis by PCR amplification monitored in real time using a reaction mixture containing final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 0.25 μM each of universal primers (Table V, SEQ ID NO: 168 and SEQ ID NO: 170), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After an initial incubation at 75° C. for 10 min to fill in the recessed 3′ ends of the ligated restriction fragments, amplifications are carried out at 95° C. for 3 min, followed by 13 cycles of 94° C. for 15 sec and 65° C. for 1.5 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries from cancer or normal DNA are pooled and used as template in PCR labeling for subsequent microarray hybridizations.

10. Analysis of the Amplified Products to Determine the Methylation Status of Target DNA

Aliquots of the amplified library DNA are analyzed for the presence of sequence adjacent to CpG sites. This can be achieved by quantitative real-time PCR amplification, comparative hybridization, ligation-mediated PCR, ligation chain reaction (LCR), fluorescent or radioactive probe hybridization, probe amplification, hybridization to promoter microarrays comprising oligonucleotides or PCR fragments, or by probing microarray libraries derived from multiple samples with labeled PCR or oligonucleotide probes. The magnitude of the signal will be proportional to the level of methylation of a promoter site.

A typical quantitative real-time PCR-based methylation analysis reaction comprises 1× Taq polymerase reaction buffer, about 10 to about 50 ng of library DNA, about 200 to about 400 nM of each specific primer, about 4% DMSO, 0 to about 0.5 M betaine (Sigma), 1:100,000 dilutions of fluorescein calibration dye (FCD) and SYBR Green I (SGI) (Molecular Probes), and about 5 units of Taq polymerase. PCR is carried out on an I-Cycler real-time PCR system (Bio-Rad) using a cycling protocol optimized for the respective primer pair and for the size and the base composition of the analyzed amplicon.

Alternatively, a method for analyzing all of the sequences at one time is presented in FIG. 44. The reduced complexity of the secondary methylome library allows amplification of subsets of these libraries through use of a single 3′ nucleotide as a selector. A combination of 4 A adaptors and 4 B adaptors will result in 16 amplification reactions, each containing a greatly reduced number of sequences. These amplified products can be analyzed by capillary electrophoresis, which allows the resolution of the different fragments without a priori knowledge of the identity of the sequences. Finally, the amplification products of the secondary methylation library can be analyzed by sequencing to allow the identification of the specific fragments of interest identified during capillary electrophoresis.
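The partitioning of a library into 16 reactions by 3′ selector nucleotides can be sketched as follows. The sketch assumes that each fragment is given as the top-strand insert sequence lying between adaptor A (on the left) and adaptor B (on the right), internal to any constant junction bases, and that a primer extends only when its selector base matches the first template base beyond the adaptor. These modeling choices and all names are illustrative assumptions and are not a description of FIG. 44.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def selector_bin(insert):
    """Return (A selector, B selector) for one insert sequence (5'->3')."""
    insert = insert.upper()
    a_selector = insert[0]                         # A primer reads into the insert
    b_selector = insert[-1].translate(COMPLEMENT)  # B primer reads the other strand
    return a_selector, b_selector

def bin_fragments(inserts):
    """Group inserts by the primer pair expected to amplify them."""
    bins = defaultdict(list)
    for seq in inserts:
        bins[selector_bin(seq)].append(seq)
    return dict(bins)

# Hypothetical inserts
for pair, members in bin_fragments(["ACCTGG", "AGGTTG", "TTTTAA"]).items():
    print(pair, members)
```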

In a specific embodiment described in Example 30, the distribution of promoter sites and the level of their enrichment in amplified secondary methylome libraries from cancer DNA are analyzed by quantitative PCR using primer pairs amplifying short amplicons that do not contain recognition sites for at least two of the methylation-sensitive restriction enzymes employed in the present example (Table V, SEQ ID NOs: 190-197). Methylated promoters are enriched between 16- and 128-fold (FIG. 66) relative to a control non-amplified genomic DNA. For a non-methylated promoter, no detectable product is amplified (see FIG. 66).
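
As a hedged illustration of how fold enrichment relates to quantitative PCR threshold cycles, the following sketch assumes ideal amplification (exact doubling per cycle), which real assays only approximate. The function names and the threshold-cycle values in the usage lines are hypothetical.

```python
import math


def fold_enrichment(ct_control, ct_library):
    """Fold enrichment of a site in the library relative to the control,
    assuming the product doubles in every PCR cycle."""
    return 2.0 ** (ct_control - ct_library)


def cycles_for_fold(fold):
    """Number of cycles of earlier amplification that corresponds to a given fold."""
    return math.log2(fold)


# Hypothetical threshold cycles: the library crosses threshold 4 cycles earlier.
print(fold_enrichment(30.0, 26.0))
print(cycles_for_fold(128))
```

Under this assumption, the 16- to 128-fold enrichment range reported above corresponds to the library amplifying 4 to 7 cycles earlier than the control.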

11. Sources of DNA for Methylation Analysis

Genomic DNA of any source or complexity, or fragments thereof, can be analyzed by the methods described in the invention. Clinical samples representing biopsy materials, pap smears, DNA from blood cells, serum, plasma, urine, feces, cheek scrapings, nipple aspirate, saliva, or other body fluids, DNA isolated from apoptotic cells, or cultured primary or immortalized tissue cultures can be used as a source for methylation analysis.

E. Methylation Analysis of Substantially Fragmented DNA Using Libraries Digested with Methylation-Sensitive Restriction Endonucleases that Have Recognition Sites Comprising CpG Dinucleotides

In this embodiment, there are methods for preparing libraries from substantially fragmented DNA molecules in such a way as to select for sequences that comprise recognition sites for methylation-sensitive restriction endonucleases in regions with high GC content, such as promoter CpG islands. In a preferred embodiment, serum, plasma or urine DNA, for example, is the source of the starting material. DNA isolated from serum, plasma, and urine has a typical size range of approximately 200 bp to 3 kb, based on gel analysis. Furthermore, this material can be converted into libraries and amplified by whole genome amplification methodologies (see, for example, U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; and citations herein). The synthesis of these libraries involves techniques that do not affect the methylation status of the starting DNA. It is apparent to those skilled in the art that the starting material can be obtained from any source of tissue and/or procedure that yields DNA with characteristics similar to those obtained from serum, plasma, and urine DNA (for example, DNA enzymatically degraded by one or several restriction endonucleases, DNase I, McrBC nuclease, or a combination thereof; DNA extracted from formalin-fixed, paraffin-embedded tissues; DNA isolated from other body fluids; etc.).

Following amplification of the primary methylation library, the methylated sites in the starting DNA are converted into unmethylated sites. Thus, in a second embodiment (see Example 25), the amplification products of the primary methylation libraries are amplified with a universal primer comprising a 5′ poly-C sequence. Following amplification, the resulting products are digested with the same methylation-sensitive restriction endonuclease used during creation of the primary methylation library. Subsequently, a second adaptor is ligated to the resulting fragments. Amplification is carried out using a primer complementary to the second adaptor in conjunction with a poly-C primer. The resulting amplicons will comprise only those molecules that have the second adaptor at one or both ends. Molecules that were not digested during creation of the secondary methylation library will not have the second adaptor attached and will not be amplified by the poly-C primer. This lack of amplification of molecules containing a poly-C primer at both ends has been documented, for example, in U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403. Thus, the products of amplification of the secondary methylation library will be enriched in molecules that comprised a methylated restriction endonuclease recognition site in the starting material. These products can be analyzed by methods similar to those utilized for the analysis of the primary methylation library products, or they can be sequenced to determine sites for which there is no a priori knowledge of methylation.

In one specific embodiment (such as Examples 24 and 31), the Methylome libraries are prepared from serum DNA, digested with methylation-sensitive restriction endonucleases, amplified with universal primer, and analyzed for specific sequences that were methylated in the starting material using real-time PCR. The principle of this method is disclosed in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned. Cell-free DNA is isolated from serum or urine of healthy donors or from prostate cancer patients. This DNA displays a typical banding pattern characteristic of apoptotic nucleosomal size. To repair DNA and generate blunt ends, the DNA is incubated with Klenow fragment of DNA polymerase I in the presence of all four dNTPs. Ligation of universal K_(U) adaptor (Table VI) is then performed using T4 DNA ligase. Samples are purified by ethanol precipitation and split into 2 aliquots. One aliquot is digested with a cocktail of methylation-sensitive restriction enzymes AciI, HhaI, BstUI, HpaII, and Hinp1I. The second aliquot is incubated in parallel but without restriction enzymes (“uncut” control). Libraries are amplified by quantitative real-time PCR with universal primer K_(U) (Table VI, SEQ ID NO: 15) in the presence of additives that facilitate replication through promoter regions with high GC content and excessive secondary structure. Amplified library DNA is purified, and the presence of amplifiable promoter sequences in the libraries comprising one or more CpG sites as part of the methylation-sensitive restriction enzymes recognition sequences is analyzed by quantitative PCR using specific primers flanking such sites. FIG. 55 shows typical amplification curves of promoter sites for genes implicated in cancer from Methylome libraries synthesized from the serum DNA of cancer patients as compared to healthy donor controls. As expected, the level of methylation in serum DNA from cancer patients was much lower than in tumor tissue or cancer cell lines, since cancer DNA in circulation represents only a relatively small fraction of the total cell-free DNA. The method disclosed here is sufficiently sensitive to reliably detect methylation in body fluids and can be applied as a diagnostic tool for early detection, prognosis, or monitoring of the progression of cancer disease.

In another specific embodiment (Examples 24, 31), the Methylome libraries are created from urine DNA as described above, digested with methylation-sensitive restriction endonucleases, amplified with universal primer, and analyzed for specific sequences that were methylated in the starting material using real-time PCR. FIG. 56 shows typical amplification curves of promoter sites for genes implicated in cancer from methylome libraries synthesized from urine DNA of cancer patients as compared to healthy donor controls. As expected, the level of methylation in urine DNA from cancer patients was much lower than in tumor tissue or cancer cell lines, since cancer DNA in circulation represents only a relatively small fraction of the total cell-free DNA. This trend is especially pronounced for urine DNA. The method disclosed here is sufficiently sensitive to reliably detect methylation in body fluids and can be applied as a diagnostic tool for early detection, prognosis, or monitoring of the progression of cancer disease.

The resulting products can also be analyzed by sequencing, ligation chain reaction, ligation-mediated polymerase chain reaction, probe hybridization, probe amplification, microarray hybridization, a combination thereof, or other methods known in the art, for example.

In one specific embodiment, preparation of the Methylome library from cell-free urine DNA is further optimized. Example 32 describes the development of a single-tube library preparation and amplification method for Methylome libraries from urine DNA and its advantages over the two-step protocol described in Example 31. The disclosed invention allows elimination of the DNA precipitation step introduced in the Example 31 protocol after the ligation reaction and direct use of the DNA sample after the ligation reaction in the restriction digestion reaction. In the single-tube method, the entire process takes place in a universal buffer that supports all enzymatic activities. Klenow fragment of DNA polymerase I, T4 DNA ligase, and the mix of methylation-sensitive restriction enzymes are added sequentially to the same tube. Libraries are amplified by quantitative real-time PCR with universal primer K_(U) (Table VI, SEQ ID NO: 15) in the presence of additives that facilitate replication through promoter regions with high GC content and excessive secondary structure. Amplified library DNA is purified and the presence of amplifiable promoter sequences in the libraries comprising one or more CpG sites as part of the methylation-sensitive restriction enzymes recognition sequences is analyzed by quantitative PCR using specific primers flanking such sites. Digested samples from the single-tube protocol have a greatly reduced background as compared to the two-step protocol, whereas the uncut samples amplified identically (FIG. 57). This results in significant improvement of the dynamic range of the assay. Another advantage of the single-tube protocol is reduced hands-on time and improved high-throughput and automation capability.

In another specific embodiment (Example 34), it is demonstrated that selection of DNA polymerase is critical for the preservation of DNA methylation within the promoter regions during the Methylome library synthesis. Cell-free DNA in urine or circulating in plasma and serum is likely to be excessively nicked and damaged due to its natural apoptotic source and the presence of nuclease activities in blood and urine. During repair of ends using a DNA polymerase with 3′-exonuclease activity, internal nicks are expected to be extended, a process that can potentially lead to replacement of methyl-cytosine with non-methylated cytosine and loss of the methylation signature. The stronger the strand displacement (or nick-translation) activity of the polymerase, the more likely the 5-methylcytosine would be replaced with normal cytosine during the repair process. Example 34 compares two DNA polymerases (T4 DNA polymerase and Klenow fragment of DNA polymerase I) capable of polishing DNA termini to produce blunt ends and the ability of each to preserve the methylation signature of CpG islands prior to cleavage with methylation-sensitive restriction enzymes.

As shown in FIG. 59, when fully methylated urine DNA was treated with Klenow fragment of DNA polymerase I prior to restriction cleavage, a 2-3 cycle shift of the amplification curves was observed, suggesting that a significant fraction (estimated 75% to 90%) of methyl-cytosine was lost during the DNA end repair. On the other hand, when T4 polymerase was used for DNA end repair, the shift was only one cycle or less depending on the site analyzed. This suggests that 50% or more of the methyl-cytosine was preserved. These results are in agreement with literature data showing that E. coli DNA polymerase I has stronger strand-displacement activity than T4 polymerase. Thus, T4 DNA polymerase is the preferable enzyme to produce blunt ends for methylome library preparation from urine or other sources of degraded or nicked DNA.
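
The relationship between the observed cycle shift and the estimated loss of methyl-cytosine can be sketched as follows. This is a minimal illustration that assumes ideal PCR efficiency (doubling per cycle) and takes the uncut sample as the reference; the function name `fraction_preserved` is hypothetical.

```python
def fraction_preserved(cycle_shift):
    """Fraction of methylated template surviving digestion, given the delay
    (in cycles) of the cut sample relative to the uncut control.
    Assumes the amount of template halves for every cycle of delay."""
    return 2.0 ** (-cycle_shift)


for shift in (1, 2, 3):
    kept = fraction_preserved(shift)
    print(f"{shift}-cycle shift: {kept:.1%} preserved, {1 - kept:.1%} lost")
```

Under these assumptions a 2-cycle shift corresponds to 75% loss and a 3-cycle shift to 87.5% loss, in line with the estimated 75% to 90% range, while a shift of one cycle or less corresponds to 50% or more preserved.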

In one specific embodiment (Example 38), preparation of the Methylome libraries from cell-free DNA is further simplified to combine three processes, specifically, DNA end “polishing” reaction, adaptor ligation reaction, and “fill-in” end synthesis reaction into one single reaction. A single step preparation of the genomic library from cell-free urine DNA utilizes a special hairpin oligonucleotide adaptor containing deoxy-uridine in both its 5′ stem region and in its loop (Table VI, SEQ ID NO: 172). The hairpin oligonucleotide is ligated via its free 3′ end to the 5′ phosphates of target DNA molecules in the presence of 3 enzymatic activities: T4 DNA ligase, DNA polymerase, and Uracil-DNA glycosylase (UDG). Several reactions proceed simultaneously: T4 DNA polymerase creates blunt ends on DNA fragments and maintains blunt ends on the hairpin adaptor; UDG catalyzes the release of free uracil and creates abasic sites in the adaptor's loop region and the 5′ half of the hairpin; T4 DNA ligase ligates the 3′ end of the hairpin adaptor to the 5′ phosphates of target DNA molecules; and the strand-displacement activity of the DNA polymerase extends the 3′ end of DNA into the adaptor region until an abasic site (region) is reached that serves as a replication stop. This process results in truncated 3′ ends of the library fragments such that they do not have terminal inverted repeats. The entire process takes place in a single tube in one step and is completed in just 1 hour, for example. It is followed by multiple methylation-sensitive restriction enzyme digestion with a cocktail of, for example, Aci I, Hha I, Hpa II, HinP1 I, and Bst UI enzymes, PCR amplification, and methylation analysis by real-time PCR, for example.

FIG. 63 shows PCR amplification curves of specific promoter sites from amplified libraries prepared from methylated or non-methylated urine DNA with or without cleavage with methylation-sensitive restriction enzymes. As expected, promoter sites from non-methylated cleaved DNA amplified with significant (at least 10 cycles) delay as compared to uncut DNA for all four promoter sites tested. On the other hand, methylated DNA is refractory to cleavage.

In another specific embodiment (Example 39), preparation of the Methylome libraries from cell-free DNA is simplified to its theoretical limit by combining all four processes, specifically, DNA end “polishing” reaction, adaptor ligation reaction, “fill-in” end synthesis reaction, and multiple methylation-sensitive restriction enzyme digestion into one single step. A single step preparation of the Methylome library from cell-free urine DNA utilizes a special hairpin oligonucleotide adaptor comprising deoxy-uridine in both its 5′ stem region and in its loop (Table VI, SEQ ID NO: 172). The hairpin oligonucleotide is ligated via its free 3′ end to the 5′ phosphates of target DNA molecules in the presence of 3 enzymatic activities: T4 DNA ligase, DNA polymerase, and Uracil-DNA glycosylase (UDG). Several reactions proceed simultaneously: T4 DNA polymerase creates blunt ends on DNA fragments and maintains blunt ends on the hairpin adaptor; UDG catalyzes the release of free uracil and creates abasic sites in the adaptor's loop region and the 5′ half of the hairpin; T4 DNA ligase ligates the 3′ end of the hairpin adaptor to the 5′ phosphates of target DNA molecules; the strand-displacement activity of the DNA polymerase extends the 3′ end of DNA into the adaptor region until an abasic site (region) is reached which serves as a replication stop; and finally, a cocktail of methylation-sensitive restriction enzymes (such as the exemplary Aci I, Hha I, Hpa II, HinP1 I, and Bst UI) degrades non-methylated CpG-rich regions within the continuously prepared Methylome library. This process results in truncated 3′ ends of the library fragments such that they do not have terminal inverted repeats. The entire process takes place in a single tube in one step and is completed within 1 hour. It is followed by PCR amplification and methylation analysis by real-time PCR, for example.

FIG. 64 shows PCR amplification curves of specific promoter sites in amplified libraries prepared from methylated or non-methylated urine DNA in the presence or in the absence of methylation-sensitive restriction enzymes. As expected, promoter sites from non-methylated cleaved DNA amplified with significant (at least 10 cycles) delay as compared to uncut DNA for all four promoter sites tested. On the other hand, methylated DNA is completely refractory to cleavage. These results demonstrate that the unique Methylome library preparation method disclosed in the present invention can be applied as a simple one-step non-invasive high-throughput diagnostic procedure for detection of aberrant methylation in cancer.

In a preferred embodiment, there is a method for the preparation of Methylome libraries from substantially fragmented DNA in a multi-enzyme single step reaction that simultaneously involves DNA, DNA polymerase, DNA ligase, deoxy-uridine-comprising oligonucleotide adaptor, a mix of methylation-sensitive restriction enzymes, and a buffer system that supports all of these enzymatic activities (FIG. 68D). The DNA polymerase is preferably T4 DNA polymerase or Klenow fragment of E. coli DNA polymerase I, the DNA ligase is T4 DNA ligase, and the cocktail of methylation-sensitive restriction enzymes comprises the following: Aci I, BstU I, Hha I, HinP1 I, HpaII, Hpy99 I, Ava I, Bce AI, Bsa HI, Bsi E1, Hga I, or a mixture thereof. The attached hairpin adaptor comprises deoxy-uridine in its loop that is converted to a replication stop comprising an abasic site, and the enzyme that converts deoxy-uridine to an abasic site is uracil-DNA glycosylase. The universal buffer that efficiently supports those enzyme activities is, for example, New England Biolabs buffer 4 (NEBuffer 4).

In some embodiments where the DNA molecule comprises nicked, partially single-stranded or otherwise damaged DNA, such as, for example, cell-free serum or urine DNA, the polymerase of choice is a DNA polymerase with reduced strand-displacement activity, such as T4 DNA polymerase.

An exemplary multi-enzyme single-step reaction that simultaneously involves DNA, DNA polymerase, DNA ligase, deoxy-uridine-comprising oligonucleotide adaptor, and a mix of methylation-sensitive restriction enzymes is performed in a reaction mixture having volume ranging from between about 10 and about 50 μl. The reaction mixture preferably comprises about 0.5 to about 100 ng of DNA, or in particular embodiments less than about 0.5 ng DNA, between 0.5-about 5 μM of deoxy-uridine containing hairpin adaptor, between 1-about 200 μM of all four dNTPs, between 0.1-about 10 mM ATP, between 0-about 0.1 mg/ml of bovine serum albumin (BSA), between 0.1-about 10 units of T4 DNA polymerase or Klenow fragment of E. coli DNA polymerase I, between 0.1-about 10 units of uracil-DNA glycosylase (UDG), between 10-about 5,000 units of T4 DNA ligase, and between about 0.1-about 50 units of a methylation-sensitive restriction endonuclease including but not limited to the following: Aci I, BstU I, Hha I, HinP1 I, HpaII, Hpy99 I, Ava I, Bce AI, Bsa HI, Bsi E1, Hga I or a mixture thereof. The reaction buffer preferably has a buffering capacity that is operative at physiological pH between about 6.5 and about 9. Preferably, the incubation time of the reaction is between about 10 to about 180 min, and the incubation temperature is between about 16° C.-about 42° C. in a buffer that efficiently supports all enzymatic activities such as the exemplary NEBuffer 4 (5 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate, 1 mM dithiothreitol, pH 7.9 at 25° C.).
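
For bookkeeping purposes, the exemplary ranges above can be captured in a small lookup table and used to flag a planned reaction that falls outside them. The following is only a sketch: the key names, the helper `out_of_range`, and the example recipe are hypothetical, and the "about" qualifiers in the text are treated here as hard limits for simplicity.

```python
# Ranges transcribed from the exemplary single-step reaction described above.
# Key names and units are hypothetical labels chosen for this sketch.
RANGES = {
    "volume_ul": (10, 50),
    "dna_ng": (0, 100),  # the text also allows less than about 0.5 ng
    "adaptor_uM": (0.5, 5),
    "dntp_uM": (1, 200),
    "atp_mM": (0.1, 10),
    "bsa_mg_per_ml": (0, 0.1),
    "polymerase_units": (0.1, 10),
    "udg_units": (0.1, 10),
    "ligase_units": (10, 5000),
    "restriction_units": (0.1, 50),
    "ph": (6.5, 9),
    "time_min": (10, 180),
    "temp_c": (16, 42),
}


def out_of_range(recipe):
    """Return the names of recipe entries that fall outside the stated ranges."""
    problems = []
    for name, value in recipe.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            problems.append(name)
    return problems


# Hypothetical recipe used only to exercise the check.
recipe = {"volume_ul": 25, "dna_ng": 10, "ligase_units": 400,
          "ph": 7.9, "time_min": 60, "temp_c": 37}
print(out_of_range(recipe))
```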

In a specific embodiment (Example 33), there is analysis and determination of the dynamic range and sensitivity limits of methylation detection in cell-free urine DNA samples using mixed libraries of artificially methylated and non-methylated DNA. As shown in FIG. 58, as little as 0.01% of methylated DNA can be reliably detected in the background of 99.99% of non-methylated DNA. The figure also shows that the method disclosed in the present invention has a dynamic range of at least 3 orders of magnitude.

1. Attachment of Adaptors

There are two specific methods for the attachment of universal adaptors to the ends of DNA isolated from serum and plasma. Both of these methods have been detailed in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned, and are included in their entirety by mention herein. The first method involves the polishing of the 3′ ends of serum or plasma DNA to create blunt ends, followed by ligation of the universal adaptor. The second method involves ligation of universal adaptors with a combination of specific 5′ and 3′ overhangs to the ends of the serum or plasma DNA.

a. Polishing of Serum, Plasma, and Urine DNA and Ligation of Universal Adaptors

DNA that has been isolated from serum and plasma has been demonstrated to have at least three types of ends: 3′ overhangs, 5′ overhangs, and blunt ends (U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; and references herein). In order to effectively ligate the adaptors to these molecules and extend these molecules across the region of the known adaptor sequence, the 3′ ends need to be repaired so that preferably the majority of ends are blunt. This procedure is carried out by incubating the DNA fragments with a DNA polymerase having both 3′→5′ exonuclease activity and 5′→3′ polymerase activity, such as Klenow or T4 DNA polymerase, for example. Although reaction parameters may be varied by one of skill in the art, in an exemplary embodiment incubation of the DNA fragments with Klenow in the presence of 40 nmol dNTP and 1× T4 DNA ligase buffer results in optimal production of blunt end molecules with competent 3′ ends.

Alternatively, Exonuclease III and T4 DNA polymerase can be utilized to remove 3′ blocked bases from recessed ends and extend them to form blunt ends. In a specific embodiment, an additional incubation with T4 DNA polymerase or Klenow maximizes production of blunt ended fragments with 3′ ends that are competent to undergo ligation to the adaptor.

In specific embodiments, the ends of the double stranded DNA molecules still comprise overhangs following such processing, and particular adaptors are utilized in subsequent steps that correspond to these overhangs.

Urine DNA is likely to be excessively nicked and damaged. During repair of ends using DNA polymerase with 3′-exonuclease activity, internal nicks are expected to be extended, a process that can potentially lead to replacement of methyl-cytosine with non-methylated cytosine. The stronger the strand displacement (or nick-translation) activity of the polymerase, the more methyl-cytosine would be lost in the process. Example 34 compares two DNA polymerases, each capable of polishing DNA termini to produce blunt ends competent for ligation, for their ability to preserve methylation of CpG islands prior to cleavage with methylation-sensitive restriction enzymes.

Cell-free DNA isolated from urine is artificially methylated at all CpG sites by incubation with M.SssI CpG methylase in the presence of S-adenosylmethionine (SAM). Two aliquots of methylated DNA are processed for enzymatic repair of termini by incubation with Klenow fragment of DNA polymerase I or with T4 DNA Polymerase in the presence of all four dNTPs. Samples are ligated to universal K_(U) adaptor (Table VI) with T4 DNA ligase, and split into 2 aliquots. One aliquot is digested with a cocktail of methylation-sensitive restriction enzymes AciI, HhaI, BstUI, HpaII, and Hinp1I. The second aliquot is incubated in parallel but without restriction enzymes (“uncut” control).

Libraries are amplified by PCR with universal primer K_(U) (Table VI, SEQ ID NO: 15), and the presence of promoter sequences in the amplified libraries comprising one or more CpG sites as part of the methylation-sensitive restriction enzymes recognition sequences is analyzed by quantitative PCR using specific primers flanking such sites.

When methylated urine DNA was treated with Klenow fragment of DNA polymerase I prior to restriction cleavage, 75% to 90% of the methyl-cytosine was lost during the enzymatic repair. On the other hand, when T4 polymerase was used for polishing, 50% or more of the methyl-cytosine was preserved.

Thus, in a particular embodiment, the DNA polymerase used for repair of DNA prior to methylome library preparation is T4 DNA polymerase.

b. Ligation of Universal Adaptors with 5′ and 3′ Overhangs to Serum and Plasma DNA

DNA that has been isolated from serum and plasma has been demonstrated to have at least three types of ends: 3′ overhangs, 5′ overhangs, and blunt ends (U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned; and references herein). This mixture of ends precludes the ligation of a universal adaptor with a single type of end. Thus, a specific mixture of adaptor sequences comprising both 5′ overhangs of 2, 3, 4, and 5 bp, and 3′ overhangs of 2, 3, 4, and 5 bp has been developed and demonstrated to yield optimal ligation to serum and plasma DNA. The characteristics of ligation of this mixture to serum and plasma DNA have been documented in U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403. These adaptors are illustrated in FIG. 48. These adaptors are composed of two oligos, 1 short and 1 long, which are hybridized to each other at some region along their length. In a specific embodiment, the long oligo is a 20-mer that will be ligated to the 5′ end of fragmented DNA. In another specific embodiment, the short oligo strand is a 3′ blocked 11-mer complementary to the 3′ end of the long oligo. A skilled artisan recognizes that the length of the oligos that comprise the adaptor may be modified in alternative embodiments. For example, a range of oligo length for the long oligo is about 18 bp to about 100 bp, and a range of oligo length for the short oligo is about 7 bp to about 20 bp. Furthermore, the structure of the adaptors has been developed to minimize ligation of adaptors to each other via at least one of three means: 1) absence of a 5′ phosphate group necessary for ligation; 2) presence of about a 7 bp 5′ overhang that prevents ligation in the opposite orientation; and/or 3) a 3′ blocked base preventing fill-in of the 5′ overhang.

A typical ligation procedure involves the incubation of 1 to 100 ng of DNA in 1× T4 DNA ligase buffer, 10 pmol of each adaptor, and 400 Units of T4 DNA Ligase. Ligations are performed at 16° C. for 1 hour, followed by inactivation of the ligase at 75° C. for 15 minutes. The products of ligation can be stored at −20° C. to 4° C. until amplification.
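
To give a sense of the adaptor excess in such a ligation, the following sketch converts a DNA mass into picomoles of fragment ends. It relies on assumptions not stated in the text: an average mass of about 660 g/mol per base pair of double-stranded DNA (a common approximation) and a hypothetical mean fragment length supplied by the user. The function names are illustrative only.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA; common approximation (assumption)


def pmol_ends(dna_ng, mean_length_bp):
    """Picomoles of fragment ends in a dsDNA sample (two ends per fragment)."""
    pmol_fragments = dna_ng * 1000.0 / (mean_length_bp * AVG_BP_MASS)
    return 2.0 * pmol_fragments


def adaptor_excess(adaptor_pmol, dna_ng, mean_length_bp):
    """Molar ratio of adaptor molecules to available fragment ends."""
    return adaptor_pmol / pmol_ends(dna_ng, mean_length_bp)


# Hypothetical case: 100 ng of DNA with a mean fragment length of 1,000 bp.
print(round(pmol_ends(100, 1000), 3))
print(round(adaptor_excess(10, 100, 1000), 1))
```

Because the picomoles of ends scale inversely with fragment length and directly with mass, smaller inputs or longer fragments give a proportionally larger adaptor excess for the same 10 pmol of adaptor.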

In a particular embodiment, the adaptor of choice is a partially double-stranded self-inert sequence comprising non-Watson-Crick bases (for example, universal K_(U) adaptor, Table VI). Ligation of universal adaptor is performed preferably in a buffer system supporting all enzymatic activities used for methylome library synthesis, such as New England Biolabs Buffer 4 (NEBuffer 4).

An exemplary adaptor ligation is performed in a reaction mixture having volume ranging from between about 5 and about 50 μl, for example. The reaction mixture preferably comprises about 0.5 to about 100 ng of DNA, or in particular embodiments less than about 0.5 ng DNA, between 0.5-about 10 μM of partially double-stranded self-inert adaptor (universal K_(U) adaptor, Table VI), between 0.1-about 10 mM ATP, and between 10-about 5,000 units of T4 DNA ligase or another suitable DNA ligase. Preferably, the incubation time of the reaction is between about 10 to about 180 min, and the incubation temperature is between about 16° C.-about 42° C. in a buffer that efficiently supports all enzymatic activities, such as the exemplary NEBuffer 4 (5 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate, 1 mM dithiothreitol, pH 7.9 at 25° C.).

2. Choice of Restriction Endonuclease

Methylation-sensitive restriction enzymes with recognition sites comprising the CpG dinucleotide and no adenine or thymine are expected to cut genomic DNA with much lower frequency as compared to their counterparts having recognition sites with normal GC to AT ratio. There are two reasons for this. First, due to the high rate of methyl-cytosine to thymine transition mutations, the CpG dinucleotide is severely under-represented and unequally distributed across the human genome. Large stretches of DNA are depleted of CpGs and thus do not contain these restriction sites. Second, most methylated cytosine residues are found in CpG dinucleotides that are located outside of CpG islands, primarily in repetitive sequences. Due to methylation, these sequences will also be protected from cleavage. On the other hand, about 50 to 60% of the known genes comprise CpG islands in their promoter regions and they are maintained largely unmethylated, except in the cases of normal developmental gene expression control, gene imprinting, X chromosome silencing, or aberrant methylation in cancer and some other pathological conditions, for example. These CpG islands will be digested by the methylation-sensitive restriction enzymes in normal gene promoter sites but not in aberrantly methylated promoters. Four-base GC recognition restriction enzymes as exemplified by Aci I, BstU I, Hha I, HinP1 I, and Hpa II with recognition sites CCGC, CGCG, GCGC, GCGC, and CCGG, respectively (Table III), are particularly useful since they will frequently cut non-methylated DNA in CpG islands, but not methylated DNA. A complete list of methylation-sensitive restriction endonucleases is presented in Table III.
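
The frequency with which these four-base sites occur in a candidate amplicon can be checked with a short script. The sketch below is illustrative only: the helper names are hypothetical, the test sequence is made up, and it assumes that Hha I and HinP1 I both recognize GCGC. Because CCGC (Aci I) is not palindromic, the reverse complement GCGG is searched as well.

```python
import re

# Recognition sequences (top strand, 5'->3') for the exemplified enzymes.
SITES = {
    "Aci I": "CCGC",
    "BstU I": "CGCG",
    "Hha I": "GCGC",
    "HinP1 I": "GCGC",
    "Hpa II": "CCGG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Map each enzyme to the sorted 0-based start positions of its site on either strand."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = set()
        for pattern in {site, revcomp(site)}:
            # A lookahead is used so that overlapping occurrences are all reported.
            positions.update(m.start() for m in re.finditer(f"(?={pattern})", seq))
        hits[enzyme] = sorted(positions)
    return hits


# Made-up sequence used only to exercise the function.
print(find_sites("AACCGGTTGCGCAA"))
```

An amplicon with no hits for a given enzyme would not report on that enzyme's cleavage, which is relevant when choosing primer pairs for the quantitative PCR analysis.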

In preferred embodiments the methylation-sensitive restriction enzymes include but are not limited to the following: Aci I, BstU I, Hha I, HinP1 I, HpaII, Hpy99 I, Ava I, Bce AI, Bsa HI, Bsi E1, Hga I or a mixture thereof.

3. Restriction Digestion of Target DNA

In a specific embodiment, target DNA is digested with a methylation-sensitive restriction endonuclease(s), such as Aci I, BstU I, Hha I, HinP1 I, and Hpa II or a compatible combination thereof. The digestion reaction comprises about 0.1 ng to 5 μg of genomic DNA, 1× reaction buffer, and about 1 to about 25 units of restriction endonuclease(s). The mixture is incubated at 37° C. or at the optimal temperature of the respective endonuclease for about 1 hour to about 16 hours to ensure complete digestion. When appropriate, the enzyme is inactivated at 65° C. to 70° C. for 15 minutes and the sample is precipitated and resuspended to a final concentration of 1 to 50 ng/μl. Genomic DNA that has not been digested is used as a positive control during library preparation and analysis.

In preferred embodiments the methylation-sensitive restriction enzymes include but are not limited to the following: Aci I, BstU I, Hha I, HinP1 I, HpaII, Hpy99 I, Ava I, Bce AI, Bsa HI, Bsi E1, Hga I or a mixture thereof. The buffer for restriction digestion will support all enzymatic activities, for example NEBuffer 4 or other compatible buffer system. To achieve complete digestion, the incubation times can vary between about 1 hour and about 24 hours, for example. Incubation temperatures can also vary depending on the optimal temperature of a particular enzyme or a combination of enzymes. Stepwise incubations can be performed to accommodate the optimal temperatures of multiple restriction enzymes. In an exemplary methylation-sensitive restriction digestion, digestion of a target DNA with a cocktail of enzymes comprising AciI, HhaI, BstUI, HpaII, and Hinp1I is carried out for 12-18 hours at 37° C., the optimal temperature for AciI, HhaI, HpaII, and Hinp1I, followed by 2 hours at 60° C., the optimal temperature for BstUI.
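
The stepwise incubation described above amounts to grouping the enzymes of a cocktail by optimal temperature and running the groups in order of increasing temperature. A minimal sketch follows; the optimal temperatures are those given in the text for the exemplary cocktail, while the function name and data layout are hypothetical.

```python
from collections import defaultdict

# Optimal temperatures (degrees C) as stated in the text for the exemplary cocktail.
OPTIMA_C = {"AciI": 37, "HhaI": 37, "HpaII": 37, "Hinp1I": 37, "BstUI": 60}


def stepwise_schedule(optima=OPTIMA_C):
    """Group enzymes by optimal temperature and order the steps from the
    lowest temperature to the highest."""
    by_temp = defaultdict(list)
    for enzyme, temp in optima.items():
        by_temp[temp].append(enzyme)
    return [(temp, sorted(by_temp[temp])) for temp in sorted(by_temp)]


for temp, enzymes in stepwise_schedule():
    print(f"{temp} C: {', '.join(enzymes)}")
```

Running the lower-temperature step first is a design choice of this sketch that mirrors the order in the exemplary digestion; step durations are left to the operator.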

A skilled artisan recognizes that complete cleavage of DNA is critical in the analysis of promoter hypermethylation from clinical samples where methylated cancer DNA represents only a small fraction of the total DNA. To relax any possible constraints imposed on restriction cleavage of promoter sequences by high GC content and secondary structure that can make cleavage incomplete, one can envision using specific treatments or additives that can facilitate relaxation. One such treatment is heating the DNA to temperatures that are not denaturing yet high enough to relax secondary structure and promote proper Watson-Crick base pairing. Example 28 illustrates the effect of pre-heating of genomic DNA on the efficiency of cleavage by the methylation-sensitive restriction enzyme Aci I. Genomic DNA is pre-heated for 30 minutes at 85° C., 90° C., or 95° C. and analyzed by quantitative PCR for amplification of a promoter region of the human p16 gene that is very GC-rich and comprises excessive secondary structure. Pre-heating at 85° C. reproducibly improves the cleavage by about a factor of 2 as compared to a control that was not pre-heated. This improvement of cleavage by pre-heating at 85° C. was demonstrated for multiple promoter sites and restriction enzymes. Thus, in specific embodiments genomic DNA is pre-heated to 85° C. prior to cleavage with restriction enzymes.

4. Extension of the 3′ end of the DNA Fragment to Fill in the Universal Adaptor

Due to the absence of a phosphate group at the 5′ end of the adaptor, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. A 72° C. extension step is performed on the DNA fragments in the presence of 1× DNA polymerase, 1× PCR Buffer, 200 μM of each dNTP, and 1 μM universal primer. This step may be performed immediately prior to amplification using Taq polymerase or may be carried out using a thermo-labile polymerase, such as if the libraries are to be stored for future use, for example.

5. Amplification of Primary Methylation Library

A typical amplification step with universal sequence primer comprises between about 1 and about 25 ng of library products and between about 0.3 and about 2 μM of universal sequence primer with or without the presence of a poly-C sequence at the 5′ end, in a standard PCR reaction well known in the art, under conditions optimal for a thermostable DNA polymerase, such as Taq DNA polymerase, Pfu polymerase, or derivatives and mixtures thereof.

6. Analysis of the Amplified Products to Determine the Methylation Status of Target DNA

Aliquots of the amplified library DNA are analyzed for the presence of CpG sites or regions encompassing more than one such site. This can be achieved by quantitative real-time PCR amplification, comparative hybridization, ligation-mediated PCR, ligation chain reaction (LCR), fluorescent or radioactive probe hybridization, probe amplification, hybridization to promoter microarrays comprising oligonucleotides or PCR fragments, or by probing microarray libraries derived from multiple samples with labeled PCR or oligonucleotide probes, for example. The magnitude of the signal will be proportional to the level of methylation of a promoter site.

A typical quantitative real-time PCR-based methylation analysis reaction comprises 1× Taq polymerase reaction buffer, about 10 to about 50 ng of library DNA, about 200 to about 400 nM of each specific primer, about 4% DMSO, 0 to about 0.5 M betaine (Sigma), 1:100,000 dilutions of fluorescein calibration dye (FCD) and SYBR Green I (SGI) (Molecular Probes), and about 5 units of Taq polymerase. PCR is carried out on an I-Cycler real-time PCR system (Bio-Rad) using a cycling protocol optimized for the respective primer pair and for the size and the base composition of the analyzed amplicon.

7. Sources of DNA for Methylation Analysis

The source of genomic DNA in one embodiment is serum, plasma, or urine DNA. This DNA has been demonstrated to have a size distribution of approximately 200 bp to 3 kb. Furthermore, this DNA comprises 5′ phosphate groups and 3′ hydroxyl groups that facilitate the attachment of universal adaptors. Genomic DNA of any source or complexity with characteristics similar to those found in DNA from serum and plasma can be analyzed by the methods described in the invention. Clinical samples comprising fragmented and/or degraded DNA representing biopsy materials, pap smears, DNA from blood cells, urine, or other body fluids, or DNA isolated from apoptotic cells, and cultured primary or immortalized tissue cultures can be used as a source for methylation analysis, for example.

F. Methylation Analysis of Substantially Fragmented DNA Using Libraries Digested with the Methylation-Specific Restriction Endonuclease McrBC

In this embodiment, there are methods of preparing libraries from fragmented DNA molecules in such a way as to select for sequences that comprise recognition sites for the methylation-specific restriction endonuclease McrBC. In a preferred embodiment, serum or plasma DNA is the source of the starting material. DNA isolated from serum and plasma has a typical size range of approximately 200 bp to 3 kb, based on gel analysis. Furthermore, this material can be converted into libraries and amplified by whole genome amplification methodologies cited in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned, for example. The synthesis of these libraries involves techniques that do not affect the methylation status of the starting DNA. It is apparent to those skilled in the art that the starting material can be obtained from any source of tissue and/or procedure that yields DNA with characteristics similar to those obtained from serum and plasma DNA.

In one specific embodiment (Example 26, FIG. 47), primary methylation libraries are synthesized from serum and plasma DNA by ligation of an adaptor comprising a poly-C sequence, and digestion with the methylation-specific restriction endonuclease McrBC. Subsequently, a second adaptor, or mixture of adaptors, is ligated to the resulting fragments. Amplification of the methylation library is carried out using a primer complementary to the second adaptor(s) in conjunction with a poly-C primer. The resulting amplicons will comprise only those molecules that have the second adaptor at one or both ends. Molecules that were not digested by McrBC will not have the second adaptor(s) attached and will not be amplified by the poly-C primer. This lack of amplification of molecules containing a poly-C primer at both ends has been documented in U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. Patent Application No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403; and U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned. Thus, the products of amplification of the secondary methylation library will be enriched in molecules that comprised two or more methylated CpGs in the starting material. The resulting products can be analyzed by PCR, microarray hybridization, probe assay, probe hybridization, probe amplification, or other methods known in the art, for example. Alternatively, they can be sequenced to determine sites for which there is no a priori knowledge of importance. Due to the variation in where McrBC cleavage occurs between two methylated CpG sites, further analysis may be required to determine which specific CpG sites were methylated in the starting material in regions comprising three or more CpGs.

1. Attachment of Adaptors

There are two specific methods for the attachment of universal adaptors to the ends of DNA isolated from serum and plasma. Both of these methods have been detailed in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned, for example. The first method involves the polishing of the 3′ ends of serum or plasma DNA to create blunt ends, followed by ligation of the universal adaptor. The second method involves ligation of universal adaptors with a combination of specific 5′ and 3′ overhangs to the serum or plasma DNA. For this embodiment, the adaptors that are ligated to the ends of the molecules will comprise a poly-C sequence, either alone or in combination with a universal priming sequence. Alternatively, a poly-G sequence can be added to the ends of the ligated molecules by terminal transferase addition.

a. Polishing of Serum and Plasma DNA and Ligation of Universal Adaptors

DNA that has been isolated from serum and plasma has been demonstrated to have at least three types of ends: 3′ overhangs, 5′ overhangs, and blunt ends. In order to effectively ligate the adaptors to these molecules and extend these molecules across the region of the known adaptor sequence, the 3′ ends need to be repaired so that preferably the majority of ends are blunt. This procedure is carried out by incubating the DNA fragments with a DNA polymerase having both 3′→5′ exonuclease activity and 5′→3′ polymerase activity, such as Klenow or T4 DNA polymerase, for example. Although reaction parameters may be varied by one of skill in the art, in an exemplary embodiment incubation of the DNA fragments with Klenow in the presence of 40 nmol dNTP and 1× T4 DNA ligase buffer results in optimal production of blunt end molecules with competent 3′ ends.

Alternatively, Exonuclease III and T4 DNA polymerase can be utilized to remove 3′ blocked bases from recessed ends and extend them to form blunt ends. In a specific embodiment, an additional incubation with T4 DNA polymerase or Klenow maximizes production of blunt ended fragments with 3′ ends that are competent to undergo ligation to the adaptor.

In specific embodiments, the ends of the double stranded DNA molecules still comprise overhangs following such processing, and particular adaptors are utilized in subsequent steps that correspond to these overhangs.

b. Ligation of Universal Adaptors with 5′ and 3′ Overhangs to Serum and Plasma DNA

DNA that has been isolated from serum and plasma has been demonstrated to have at least three types of ends: 3′ overhangs, 5′ overhangs, and blunt ends.

This mixture of ends precludes the ligation of a universal adaptor with a single type of end. Thus, a specific mixture of adaptor sequences containing both 5′ overhangs of 2, 3, and 5 bp, and 3′ overhangs of 2, 3, and 5 bp has been developed and demonstrated to yield optimal ligation to serum and plasma DNA. The characteristics of ligation of this mixture to serum and plasma DNA have been documented in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned. These exemplary adaptors are illustrated in FIG. 48. These adaptors are comprised of two oligos, 1 short and 1 long, which are hybridized to each other at some region along their length. In a specific embodiment, the long oligo is a 20-mer that will be ligated to the 5′ end of fragmented DNA. In another specific embodiment, the short oligo strand is a 3′ blocked 11-mer complementary to the 3′ end of the long oligo. A skilled artisan recognizes that the length of the oligos that comprise the adaptor may be modified, in alternative embodiments. For example, a range of oligo length for the long oligo is about 18 bp to about 100 bp, and a range of oligo length for the short oligo is about 7 bp to about 20 bp. Furthermore, the structure of the adaptors has been developed to minimize ligation of adaptors to each other via at least one of three means: 1) absence of a 5′ phosphate group necessary for ligation; 2) presence of about a 7 bp 5′ overhang that prevents ligation in the opposite orientation; and/or 3) presence of a 3′ blocked base preventing fill-in of the 5′ overhang.

A typical ligation procedure involves the incubation of 1 to 100 ng of DNA in 1× T4 DNA ligase buffer, 10 pmol of each adaptor, and 400 Units of T4 DNA Ligase. Ligations are performed at 16° C. for 1 hour, followed by inactivation of the ligase at 75° C. for 15 minutes. The products of ligation can be stored at −20° C. to 4° C. until amplification.
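To give a sense of scale for the ligation conditions above, the molar excess of adaptor over fragment ends can be estimated as follows. This is a rough sketch under stated assumptions: the average mass of about 660 g/mol per base pair is a common approximation, the 500 bp mean fragment length is a hypothetical value chosen from within the 200 bp to 3 kb range, and the function names are illustrative only.

```python
# Approximate average molar mass of one base pair of dsDNA (assumption).
AVG_BP_MASS_G_PER_MOL = 660.0


def dsdna_pmol(mass_ng, length_bp):
    """Picomoles of double-stranded DNA molecules in mass_ng nanograms.

    ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3.
    """
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def adaptor_to_end_ratio(adaptor_pmol, mass_ng, length_bp):
    """Molar ratio of one adaptor species to fragment ends
    (each fragment contributes two ends)."""
    ends_pmol = 2.0 * dsdna_pmol(mass_ng, length_bp)
    return adaptor_pmol / ends_pmol


# 10 pmol of one adaptor with 100 ng of DNA, assuming a 500 bp mean length.
ratio = adaptor_to_end_ratio(adaptor_pmol=10.0, mass_ng=100.0, length_bp=500)
print(f"adaptor : end molar ratio is about {ratio:.1f} : 1")
```

Under these assumptions the adaptor is in excess even at the upper end of the DNA input range, and the excess grows as the input drops toward 1 ng.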

2. Extension of the 3′ end of the DNA fragment to fill in the universal adaptor

Due to the absence of a phosphate group at the 5′ end of the adaptor, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. An extension step is performed on the DNA fragments in the presence of Klenow, 1× Buffer, and 40 nmol of each dNTP at 25° C. for 15 minutes, followed by inactivation of the enzyme at 75° C. for 10 min, and cooling to 4° C.

3. McrBC Cleavage

In embodiments of the present invention, DNA is digested with McrBC endonuclease in the presence of GTP as the energy source for subunit translocation. A typical digestion with McrBC endonuclease is performed in a volume ranging from about 5 μl to about 50 μl in buffer comprising about 50 mM NaCl, about 10 mM Tris-HCl having pH of about 7.5 to about 8.5, about 100 μg/ml of bovine serum albumin, about 0.5 to about 2 mM GTP, and about 0.2 to about 20 units of McrBC endonuclease. The temperature of incubation is between about 16° C. and about 42° C., and the duration is between about 10 min and about 16 hours. DNA amount in the reaction is between about 50 pg and about 10 μg. It should be noted that McrBC makes one cut between each pair of half-sites, cutting close to one half-site or the other, but cleavage positions are distributed over several base pairs approximately 30 base pairs from the methylated base (Panne et al., 1999), resulting in a smeared pattern instead of defined bands. In specific embodiments, digestion with McrBC is incomplete and results in predominant cleavage of a subset of sites separated by between about 35 and about 250 bases. In other specific embodiments cleavage is complete and results in digestion of substantially all possible cleavage sites. Example 3 describes the optimization of the cleavage of human genomic DNA and analysis of the termini produced by McrBC. It should be noted that from the existing literature the nature of the ends produced by McrBC digestion is not understood. Example 9 also details the analysis of the nature of the ends produced by McrBC cleavage.
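The spacing preference described above can be illustrated with a short sketch that lists pairs of methylated positions lying about 35 to about 250 bases apart. This is a simplification for illustration only: the function name `candidate_site_pairs` and the example coordinates are hypothetical, the sequence context of each half-site is not checked, and the sketch does not model where between the two sites the cut falls.

```python
from itertools import combinations


def candidate_site_pairs(methylated_positions, min_gap=35, max_gap=250):
    """Return pairs of methylated positions whose separation lies in
    [min_gap, max_gap] bases, the range preferred in partial digests."""
    positions = sorted(methylated_positions)
    return [
        (first, second)
        for first, second in combinations(positions, 2)
        if min_gap <= second - first <= max_gap
    ]


# Hypothetical coordinates of methylated cytosines on one fragment.
pairs = candidate_site_pairs([100, 120, 180, 500])
print(pairs)
```

With these example coordinates, only the pairs 80 and 60 bases apart qualify; the pair 20 bases apart is too close and the pairs involving position 500 are too far apart.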

4. Attachment of Second Adaptor(s)

Following McrBC digestion, the cleavage products are incubated in a ligation reaction comprising T4 ligase buffer, about 200 nM to about 1 μM of universal adaptors with 5′ overhangs comprising about 5 or 6 completely random bases, and about 200 to 2,500 units of T4 DNA ligase for about 1 hour to overnight at about 16° C. to about 25° C. The T4 DNA ligase is inactivated for 10 minutes at 65° C., and the reaction is cooled to 4° C.

5. Extension of the 3′ end of the DNA fragment to fill in the second adaptors

Due to the absence of a phosphate group at the 5′ end of the adaptors, only one strand of the adaptor (3′ end) will be covalently attached to the DNA fragment. A 72° C. extension step is performed on the DNA fragments in the presence of 1× DNA polymerase, 1× PCR Buffer, 200 μM of each dNTP, and 1 μM universal primer. This step may be performed immediately prior to amplification using Taq polymerase, or may be carried out using a thermo-labile polymerase, such as if the libraries are to be stored for future use, for example.

6. Amplification of the Methylation Library

The amplification of the secondary methylation library involves use of a poly-C primer, such as C₁₀ (SEQ ID NO: 38), as well as a universal primer complementary to the second adaptor. A typical amplification step comprises between about 1 and about 25 ng of library products and between about 0.3 and about 1 μM of second universal sequence primer, and about 1 μM C₁₀ primer (SEQ ID NO: 38), in a standard PCR reaction well known in the art, under conditions optimal for a thermostable DNA polymerase, such as Taq DNA polymerase, Pfu polymerase, or derivatives and mixtures thereof.

7. Analysis of the Amplified Products to Determine the Methylation Status of Target DNA

Aliquots of the amplified library DNA are analyzed for the presence of sequences adjacent to CpG sites. This can be achieved by quantitative real-time PCR amplification, comparative hybridization, ligation-mediated PCR, ligation chain reaction (LCR), fluorescent or radioactive probe hybridization, probe amplification, hybridization to promoter microarrays comprising oligonucleotides or PCR fragments, or by probing microarray libraries derived from multiple samples with labeled PCR or oligonucleotide probes, for example. The magnitude of the signal will be proportional to the level of methylation of a promoter site.

A typical quantitative real-time PCR-based methylation analysis reaction comprises 1× Taq polymerase reaction buffer, about 10 to about 50 ng of library DNA, about 200 to about 400 nM of each specific primer, about 4% DMSO, 0 to about 0.5 M betaine (Sigma), 1:100,000 dilutions of fluorescein calibration dye (FCD) and SYBR Green I (SGI) (Molecular Probes), and about 5 units of Taq polymerase. PCR is carried out on an I-Cycler real-time PCR system (Bio-Rad) using a cycling protocol optimized for the respective primer pair and for the size and the base composition of the analyzed amplicon.

In addition, the amplification products of the methylation library can be analyzed by sequencing. The variability in the site of McrBC cleavage can complicate the identification of specific methylated CpGs in CpG islands that comprise a high number of methylated sites. Therefore, sequence analysis will allow the direct determination of the specific CpG site adjacent to the cleavage site in regions of DNA that comprise multiple CpGs in close proximity.

8. Sources of DNA for Methylation Analysis

The source of genomic DNA in one embodiment is serum or plasma DNA. This DNA has been demonstrated to have a size distribution of approximately 200 bp to 3 kb. Furthermore, this DNA comprises 5′ phosphate groups and 3′ hydroxyl groups, which facilitate the attachment of universal adaptors. Genomic DNA of any source or complexity with characteristics similar to those found in DNA from serum and plasma can be analyzed by the methods described in the invention. Clinical samples comprising substantially fragmented and/or degraded DNA representing biopsy materials, pap smears, DNA from blood cells, urine, or other body fluids, or DNA isolated from apoptotic cells, and cultured primary or immortalized tissue cultures can be used as a source for methylation analysis.

G. Methylation Analysis of Substantially Fragmented DNA Using Methylome Libraries Subjected to Bisulfite Conversion

In this embodiment, there are methods for analyzing methylation by preparing libraries of fragmented DNA molecules in such a way that both bisulfite-converted library molecules and unconverted library molecules can be amplified with the same universal primer (FIGS. 49 and 50). The fragmented DNA molecules may be obtained in an already substantially fragmented form, such as purified from serum, plasma, or urine, or generated by random fragmentation by enzymatic, mechanical, or chemical means that do not change the methylation status of the original DNA, for example. Libraries are prepared from the fragmented DNA molecules by attaching adaptors resistant to bisulfite conversion. The resistant adaptors have specific sequence requirements and may have a non-hairpin structure, as described in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned, or preferably may have a dU-Hairpin structure, as described in Table VI and Examples 33, 38, and 39. Non-hairpin adaptors can comprise two different kinds of sequences, one in which the strand that is attached to the DNA fragment does not comprise cytosines, and a second in which the strand that is attached to the DNA fragment does not comprise guanines, and all cytosines in that strand are methylated. Similarly, hairpin adaptors can comprise two different kinds of sequences, one in which the 3′ stem region that is attached to the DNA fragment does not comprise cytosines, and a second in which the 3′ stem region does not comprise guanines, and any cytosines are methylated. The adaptors are attached according to methods as described in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned, or preferably as described in Examples 33, 35, 38, and 39. To further protect the adaptor sequences from bisulfite conversion, dCTP in the nucleotide mix is substituted with methyl-dCTP during fill-in of 3′ library ends. These methylome libraries are subjected to bisulfite conversion, and the converted libraries are amplified in a PCR reaction with a primer comprising the universal sequence and a thermostable polymerase. The amplified libraries may be analyzed by any of a number of specific analytical methods for bisulfite-converted DNA known in the art, such as methylation-specific PCR, sequencing, and quantitative PCR (MethyLight). The amplification of bisulfite-converted methylome libraries allows genomewide analysis of nanogram starting quantities of bisulfite-converted DNA.
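The sequence requirements for bisulfite-resistant adaptors can be illustrated with a minimal in silico sketch. It assumes the usual readout of bisulfite conversion followed by PCR, in which unmethylated cytosine is read as thymine and methylated cytosine remains cytosine. The function names and the short test sequences are hypothetical and are not the adaptor sequences of Table VI.

```python
def bisulfite_convert(strand, methylated_positions=frozenset()):
    """Return the strand as read after bisulfite conversion and PCR.

    Unmethylated C is read as T; C at a 0-based index listed in
    methylated_positions is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(strand)
    )


def is_conversion_resistant(strand, methylated_positions=frozenset()):
    """True if the strand reads the same before and after conversion."""
    return bisulfite_convert(strand, methylated_positions) == strand


# A strand without cytosines is unchanged by conversion.
no_c_strand = "AGGTTGAGGTAGTATTG"
print(is_conversion_resistant(no_c_strand))

# A strand whose cytosines are all methylated is also unchanged.
c_strand = "ACCTTACCATC"
all_c = frozenset(i for i, base in enumerate(c_strand) if base == "C")
print(is_conversion_resistant(c_strand, all_c))

# The same strand with unmethylated cytosines is converted.
print(bisulfite_convert(c_strand))
```

Because both kinds of adaptor strand read the same before and after treatment, a single universal primer can in principle amplify converted and unconverted library molecules alike, which is the property the embodiment relies on.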

In one specific embodiment (Example 35), there is a demonstration of the amplification of whole methylome libraries subjected to bisulfite conversion. Libraries are prepared from unmethylated urine DNA by attachment of bisulfite-resistant adaptor Ku (Table VI), and an aliquot of that library is amplified using the universal Ku primer (SEQ ID NO: 15). A separate aliquot of that library is amplified using the universal Ku primer. FIG. 60A shows that approximately 30% of library molecules are amplifiable after bisulfite conversion, based upon a comparison with unconverted library molecules. The bisulfite conversion of library molecules is confirmed by detecting converted DNA sequences in the amplified, converted methylome library but not the untreated methylome library (FIG. 60B).

In a preferred embodiment, high sensitivity and specificity methylation analysis is achieved by bisulfite conversion and amplification of libraries enriched for methylated gene promoter regions by methylation-sensitive restriction digestion (such as in Examples 38 and 39). For samples from sources such as serum, plasma, or urine where a major fraction of DNA may originate from normal cells and cancer DNA constitutes only a very small fraction (less than 1%, for example), amplification of enriched converted library molecules allows methylation analyses, such as MethyLight, that are not possible with converted but non-enriched DNA. The bisulfite treatment can also increase the specificity for detecting methylated gene promoter regions in the enriched libraries by greatly reducing or even completely eliminating non-methylated DNA from the library that may be present due to incomplete digestion.

H. Methylation Analysis of Substantially Fragmented DNA Using Methylome Libraries Enriched for CpG-rich DNA by Heating

In this embodiment, Methylome library synthesis employs methods for additional enrichment of CpG-rich genomic DNA from substantially fragmented DNA. Methylome libraries as described in this application are very powerful tools that permit the analysis of DNA methylation from very limited amounts and substantially fragmented samples such as cell-free DNA recovered from blood and urine, DNA isolated from biopsies, and DNA isolated from formalin fixed paraffin embedded tissues. When combined with real-time PCR analysis, as few as 2 or 3 methylated DNA molecules can be detected in a blood or urine sample. This level of robustness and sensitivity presents opportunities for multiple non-invasive diagnostic applications of the Methylome library method. Methylome libraries are characterized by a high degree of complexity and the analysis of global methylation patterns may best be resolved by hybridization to high resolution DNA microarrays. To maximize the specificity and sensitivity of Methylome analysis an efficient enrichment method may be employed to increase the relative copy number of CpG-rich DNA within the Methylome library. Previously, the present inventors described a novel enrichment method that applied secondary Methylome libraries and demonstrated that it resulted in a 16- to 128-fold enrichment level for the various methylated promoter regions. Secondary Methylome libraries demonstrate an increased efficiency in identifying methylated CpG regions; however, the complex synthesis process may limit their application. Here we introduce an alternative approach of Methylome library enrichment for the CpG-rich genomic regions which is much easier and faster than the secondary Methylome library method, specifically, the thermo-enrichment method.

The human genome has a broad distribution of base composition with most sequences having around 42% GC (FIG. 71A). CpG-rich promoters are usually characterized by significantly higher GC content ranging from 60 to 90% GC. The Thermo-enrichment method is based on differences in the thermo-stability of DNA fragments with different base composition. At high temperature all DNA molecules undergo a conformational transition called denaturation or melting, which is characterized by unwinding of double-stranded DNA structure and separation of DNA strands. It is well known in the art that DNA molecules with high GC content have higher melting temperature than molecules with low GC content. The melting temperature also depends on length of DNA fragments, concentration of ions in a buffer (characterized by ionic strength), pH, and the presence or absence of additives such as dimethylsulfoxide, betaine, or formamide, for example.
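The base-composition figures quoted above can be made concrete with a short calculation of GC content. This is an illustrative sketch: the function names and example sequences are hypothetical, and the 60 to 90% window is simply the range stated above for CpG-rich promoters, not a validated classifier.

```python
def gc_percent(sequence):
    """Percentage of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in sequence if base in "GC")
    return 100.0 * gc_count / len(sequence)


def in_cpg_rich_promoter_range(sequence, low=60.0, high=90.0):
    """True if GC content falls in the range quoted for CpG-rich promoters."""
    return low <= gc_percent(sequence) <= high


# Hypothetical example fragments.
gc_rich = "GCGCGGCCATGCGCCGGA"
at_rich = "ATATTAGCATTAAATTCA"
for name, seq in (("gc_rich", gc_rich), ("at_rich", at_rich)):
    print(name, round(gc_percent(seq), 1), in_cpg_rich_promoter_range(seq))
```

GC content alone does not give a melting temperature, since, as noted above, fragment length, ionic strength, pH, and additives also contribute.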

When a heterogeneous but equimolar mixture of DNA fragments with different base composition is exposed to increasing temperature the fragments with low GC content will denature before the fragments with high GC content. This results in different amounts of double-stranded molecules for different DNA fractions, namely, practically the same amount of double-stranded DNA for highly GC-rich fragments, an intermediate amount of double-stranded DNA for moderately GC-rich fragments, and a very low amount of double-stranded DNA for highly AT-rich DNA (FIG. 71B). When a thermally-treated mixture of blunt-ended DNA restriction fragments is cooled back down to 37° C. and incubated with T4 DNA ligase in the presence of the blunt-end DNA adaptor and ATP, the adaptor is ligated efficiently to only those DNA molecules that retained a double-stranded conformation during thermal selection, specifically, the molecules with high GC content. The higher the temperature that is used for thermo-treatment, the smaller the DNA fraction of sufficiently high GC content that remains double stranded and accepts adaptor in the blunt end ligation reaction. The selectivity of this method relies on kinetic differences of the DNA denaturation process for molecules with different GC content and the ligation reaction preference for double-stranded DNA ends.

In one specific embodiment (Example 36 and FIG. 70B, FIG. 72A), aliquots of blunt-end DNA fragments produced by Alu I digestion of human DNA were pre-heated for 10 min in 1× NEBuffer 4 at 75° C. (control), 83° C., 84.1° C., 85.3° C., 87° C., 89.1° C., 91.4° C., 93.5° C., 94.9° C., 96° C., or 97° C., snap-cooled on ice, and incubated with T4 DNA ligase, K_(U) adaptor and ATP. After completion of the fill-in synthesis at the recessed 3′ ends (15 min at 75° C.), whole genome libraries were amplified and then quantitatively analyzed using real-time PCR and primer pairs for different promoter regions. It was found that pre-heating DNA at temperatures between 89° C. and 94° C. resulted in 4 to 128-fold (median about 60-fold) enrichment of the amplified WGA library for all tested promoter regions.
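Fold enrichment measured by real-time PCR is commonly derived from the shift in threshold cycle between a control and an enriched library. The sketch below assumes ideal amplification, in which product doubles every cycle; the threshold-cycle values are hypothetical and are not data from Example 36.

```python
import math


def fold_enrichment(ct_control, ct_enriched, efficiency=2.0):
    """Fold enrichment implied by an earlier threshold cycle.

    efficiency is the per-cycle amplification factor (2.0 = perfect doubling).
    """
    return efficiency ** (ct_control - ct_enriched)


def cycles_for_fold(fold, efficiency=2.0):
    """Threshold-cycle shift that corresponds to a given fold enrichment."""
    return math.log(fold) / math.log(efficiency)


# Hypothetical threshold cycles for one promoter region.
print(fold_enrichment(ct_control=25.0, ct_enriched=18.0))

# Cycle shifts corresponding to the 4-fold and 128-fold ends of the range.
print(round(cycles_for_fold(4), 2), round(cycles_for_fold(128), 2))
```

Under the doubling assumption, the reported 4 to 128-fold range corresponds to threshold cycles shifting earlier by about 2 to about 7 cycles.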

In another specific embodiment (Example 37 and FIG. 70A, FIG. 72B), aliquots of cell-free DNA isolated from urine, and “polished” by Klenow fragment of DNA polymerase I, underwent thermo-enrichment for 10 min in 1× NEBuffer 4 at 75° C. (control), 89° C., 91° C., or 93° C., were snap-cooled on ice, and were incubated with T4 DNA ligase, K_(U) adaptor and ATP. Libraries were subsequently digested with the cocktail of methylation-sensitive restriction enzymes Aci I, HhaI, Hpa II, HinP1 I, and Bst UI, filled-in to replicate the sequence of the non-ligated adaptor strand, and amplified by PCR. Real-time PCR analysis of two CpG islands within the amplified libraries revealed a significant enrichment for the thermo-enriched Methylome libraries with a maximum enrichment level for these promoters observed in libraries prepared with pre-heating at 89° C. and 91° C.

A skilled artisan recognizes that selection for the GC-rich double-stranded DNA fraction after pre-heating step can be done not only before library amplification but also after library amplification, assuming that the universal PCR primer (primer K_(U) in the Example described above) has a phosphate group at the 5′ end generating ligation competent products. In this case (see FIG. 72D), enrichment can be achieved by heating the synthesized library amplification products to a desired melting temperature, cooling, ligating a new adaptor (or a pair of adaptors), and re-amplifying with primer(s) corresponding to the second adaptor(s). The fraction of amplified library that remained double stranded during the thermo-enrichment process will accept the second adaptor(s) and represent the fragments corresponding to the melting temperature selected for enrichment.

A skilled artisan recognizes that selection for the GC-rich double-stranded DNA fraction using a thermo-enrichment step can be done not only by using a ligation reaction but also by using a “fill-in” polymerization reaction of the recessed adaptor ends. In this case the heating step occurs after the ligation step but precedes the fill-in step. Only double-stranded DNA fragments with adaptor attached to the 5′ ends of DNA are competent templates for the extension reaction (FIG. 72C).

Thermo-enrichment of GC-rich DNA is a simple and rapid method for increasing the sensitivity and specificity of Methylome libraries. When used in combination with the One-step Methylome library synthesis, it can easily be implemented for high-throughput methylation analysis of clinical DNA samples for cancer diagnostics, and many other research and medical areas. Thermo-enriched Methylome libraries may be used as the method of choice for preparing enriched libraries for genome-wide methylation analysis.

III. EXAMPLES

The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Example 1 Design of Degenerate Pyrimidine Primers and Analysis of Self-priming and Extension

This example describes the comparison between primers of different base composition for their ability to prime a model DNA template and for their propensity to self-prime.

The model template oligonucleotide (SEQ ID NO: 9) was comprised of the T7 promoter sequence followed by 10 random purine bases at its 3′-terminus. The reaction mixture contained 1× ThermoPol reaction buffer (NEB), 4 units of Bst DNA polymerase Large Fragment (NEB), 200 μM dNTPs, 350 nM template primer 9, and 3.5 or 35 μM of self-inert degenerate pyrimidine Y and YN primers (SEQ ID NO: 1 through SEQ ID NO: 7) in a final volume of 25 μl. Controls comprising no dNTPs are also included for each Y or YN primer. Samples were incubated for 5 min or 15 min at 45° C. and stopped by adding 2 μl of 0.5 M EDTA. Aliquots of the reactions were analyzed on 10% TBE-urea denaturing polyacrylamide gels (Invitrogen) after staining with SYBR Gold dye (Molecular Probes). FIG. 5 shows the result of the comparison experiment. No evidence of self-priming was found with primers having up to 3 random bases at their 3′-end when applied at 35 μM concentration after 5 min incubation with Bst polymerase and dNTPs at 45° C. (FIG. 5A). In contrast, in the samples comprising template primer, a new band corresponding to extension products was observed at both 35 μM and 3.5 μM primer concentrations (FIG. 5B). In a separate analysis, degenerate pyrimidine primers having up to six random bases at the 3′-end were analyzed for their ability to self-prime (FIG. 5C). After 15 min of incubation with Bst polymerase, no extension products were observed with primers having 3 random bases or less (FIG. 5C, lanes 1-3), whereas the primers with higher complexity (N3 and above) showed progressively increasing amount of extension products (FIG. 5C, lanes 4-6). Control samples incubated with Bst polymerase but no dNTPs showed no extension products band (FIG. 5C, lanes 7-12). See also U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403.

Example 2 Whole Genome Amplification of Sodium Bisulfite-converted Human DNA with Klenow Fragment of DNA Polymerase I

Human genomic DNA isolated by standard methods was treated with sodium bisulfite using a modified procedure by Grunau et al. (2001). One microgram of genomic DNA in 20 μl of TE-L buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH 7.5), with or without 5 μg of carrier tRNA (Ambion), was mixed with 2.2 μl of 3.0 M NaOH and incubated at 42° C. for 20 minutes. Two hundred and forty microliters of freshly prepared sodium bisulfite reagent (5.41 g of NaHSO₃ dissolved in 8 ml distilled water, titrated to pH 5.0 with 10 N NaOH, mixed with 500 μl of 10 mM hydroquinone, and filtered through a 0.2 μm membrane filter) was added to the denatured DNA samples and incubated for 4 hours at 55° C. The DNA was desalted using a QIAEX kit (Qiagen), recovered in 110 μl of TE-L buffer and desulfonated with 12.1 μl of 3 M NaOH at 37° C. for 30 min. After desulfonation the DNA was neutralized with 78 μl of 7.5 M ammonium acetate, precipitated with 550 μl of absolute ethanol, washed twice with 700 μl of 70% ethanol and air dried. The DNA was dissolved in 30 μl of TE-L buffer and stored at −20° C. until use.
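
As a quick arithmetic check on the volumes given above, the following sketch computes the final NaOH concentration in the denaturation and desulfonation steps. It assumes the volumes are simply additive and ignores any volume contributed by the carrier tRNA; the function name is illustrative only.

```python
def final_molar(stock_molar: float, added_ul: float, sample_ul: float) -> float:
    """Final concentration (M) after adding `added_ul` of stock to `sample_ul` of sample.

    Assumes volumes are additive (a simplification).
    """
    return stock_molar * added_ul / (added_ul + sample_ul)


# Denaturation: 2.2 ul of 3.0 M NaOH added to 20 ul of DNA in TE-L buffer.
denaturation = final_molar(3.0, 2.2, 20.0)

# Desulfonation: 12.1 ul of 3 M NaOH added to 110 ul of recovered DNA.
desulfonation = final_molar(3.0, 12.1, 110.0)

if __name__ == "__main__":
    print(f"denaturation:  {denaturation:.3f} M NaOH")
    print(f"desulfonation: {desulfonation:.3f} M NaOH")
```

Under these assumptions both steps come out at approximately 0.3 M NaOH, so the two volumes appear to have been chosen to reach the same final alkali concentration.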

Sodium bisulfite-converted DNA was randomly fragmented in TE-L buffer by heating at 95° C. for 3 minutes. The reaction mixture contained 60 ng of fragmented converted DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP, 360 ng of Single Stranded DNA Binding Protein (USB), and either 0.5 μM each of degenerate R(N)₂ and facilitating R_(U)(A)₁₀(N)₂ primers (SEQ ID NO: 10 and SEQ ID NO: 18) or 0.5 μM each of degenerate Y(N)₂ and selector Y_(U)(T)₁₀(N)₂ primers (SEQ ID NO: 3 and SEQ ID NO: 19) in a final volume of 14 μl. After denaturation for 2 min at 95° C., the samples were cooled to 24° C., and the reaction was initiated by adding 5 units of the Klenow fragment of DNA polymerase I that lacks 3′-5′ exonuclease activity (NEB). Library synthesis with converted DNA was carried out at 24° C. for 1 hour. Control reactions containing 1 μM of K(N)₂ primer (SEQ ID NO: 14) were also included with either 60 ng of converted or 5 ng of non-converted (wild type) genomic DNA. Reactions were stopped with 1 μl of 83 mM EDTA (pH 8.0), and samples were heated for 5 min at 75° C. The samples were further amplified by quantitative real-time PCR by transferring the entire reaction mixture of the library synthesis reaction into a PCR reaction mixture containing a final concentration of the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 100,000× dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), 1 μM of universal R_(U), Y_(U), or K_(U) primer (SEQ ID NO: 11, SEQ ID NO: 8, and SEQ ID NO: 15) with sequences identical to the known 5′ portion of the respective degenerate and facilitating primer, and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Amplifications were carried out for 13 cycles at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad).

FIG. 6 demonstrates that 60 ng of bisulfite converted DNA amplifies equally to 5 ng of non-converted DNA when the former is amplified with degenerate R(N)₂ and facilitating R_(U)(A)₁₀(N)₂ primers and the latter with K(N)₂ primers, respectively. FIG. 7 shows a comparison between different degenerate primer sequences supplemented with their corresponding selector sequences for their ability to amplify bisulfite converted DNA. The combination of self-inert degenerate R(N)₂ and facilitating R_(U)(A)₁₀(N)₂ primers was more than an order of magnitude better than the alternative combination of Y(N)₂ and facilitating Y_(U)(T)₁₀(N)₂ primers (FIGS. 7A and B). On the other hand, the control K(N)₂ degenerate primer, designed to target non-converted DNA, amplified bisulfite converted DNA approximately one additional order of magnitude less efficiently (FIG. 7C).

DNA samples amplified from bisulfite-converted DNA using degenerate R(N)₂ and facilitating R_(U)(A)₁₀(N)₂ primers or non-converted DNA amplified using control K(N)₂ degenerate primer were purified by QIAQUICK® PCR kit (Qiagen) using the manufacturer's protocol. Ten nanograms of each amplification reaction were further analyzed for a specific genomic marker (STS sequence RH93704, UniSTS database, National Center for Biotechnology Information) with primer pairs specific for non-converted DNA (SEQ ID NO: 20 and SEQ ID NO: 21) or specific for bisulfite-converted DNA (SEQ ID NO: 22 and SEQ ID NO: 23) by quantitative real-time PCR. The PCR reaction mixture comprised the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM of each forward and reverse primer, 5 units of Titanium Taq polymerase (Clontech), and 10 ng template DNA in a final volume of 50 μl. Reactions were carried out for 40 cycles at 94° C. for 15 sec and 65° C. for 1 min on an I-Cycler real-time PCR instrument (Bio-Rad). FIG. 8 shows that approximately two orders of magnitude difference exists in the amplification of the genomic marker using PCR primers specific for converted or non-converted DNA with matched versus mismatched WGA amplified DNA as the template.

Example 3 Optimization of the Cleavage of Human Genomic DNA with McrBC Nuclease

This example describes the optimization of conditions for McrBC cleavage necessary to generate various levels of digestion of human genomic DNA.

In order to generate partially digested McrBC libraries, the rate of McrBC cleavage was investigated by varying the amount of McrBC utilized for digestion. DNA (100 ng) in 7 μl TE-Lo (10 mM Tris, 0.1 mM EDTA, pH 7.5) was added to a master mix containing 1 mM GTP, 100 μg/ml BSA, 1× T4 DNA Ligase Buffer, and H₂O. Subsequently, 1 μl of the appropriate amount of McrBC (0, 0.02, 0.04, 0.06, 0.08, 0.10 U) was added to each tube and incubated at 37° C. for 1 hour, followed by inactivation of the enzyme at 75° C. for 15 minutes and cooling to 4° C.

Universal GT adaptor was assembled in 10 mM KCl containing 20 μM Ku (SEQ ID NO: 15) and 20 μM GT short (SEQ ID NO: 54) (Table I) to form a blunt end adaptor. The adaptor was ligated to the 5′ ends of the DNA using T4 DNA ligase by addition of 0.6 μl 10× T4 DNA ligase buffer, 2.4 μl H₂O, 2 μl GT adaptor (10 pmol) and 1 μl T4 DNA Ligase (2,000 U). The reaction was carried out for 30 minutes at 16° C., the enzyme was inactivated at 75° C. for 10 minutes, and the samples were held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the universal adaptor and subsequent amplification of the library were carried out under the same conditions. Five nanograms of library or H₂O (No DNA control) was added to a 25 μl reaction comprising 25 pmol T7-C₁₀ universal primer (SEQ ID NO: 36), 200 μM of each dNTP, 1× PCR Buffer (Clontech), and 1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) were also added to allow monitoring of the reaction using an I-Cycler Real-Time PCR Detection System (Bio-Rad). The samples were initially heated to 75° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short blocked fragment of the universal adaptor. Subsequently, amplification was carried out by heating the samples to 95° C. for 3 minutes 30 seconds, followed by 18 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes. The amplification curves for all 3 samples are depicted in FIG. 9A. The amplification curves indicate decreased library generation and amplification with decreasing amounts of McrBC. Amplification of the No McrBC control indicates that a subset (<1%) of molecules in the genomic prep were of the appropriate size for library preparation. Plotting the cycle number at 50% of the maximum RFU versus McrBC quantity results in a sigmoid relationship (FIG. 9B). It should be noted that addition of greater than 0.1 U McrBC does not result in any increase in library generation or amplification. If the difference in cycles between the 0.1 U McrBC library and the other libraries is assumed to represent 1 doubling/cycle, then the effective % of McrBC digestion can be calculated. The resulting graph (FIG. 9C) indicates that small changes in the concentration of McrBC result in significant decreases in the amount of cleavage that occurs. Specifically, 0.07 U of McrBC is required to generate a 50% cleavage rate. Additional experiments have indicated that shortening the duration of McrBC incubation can also reduce the level of cleavage, although this reaction is less reproducible and more difficult to control.
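
The effective digestion calculation described above can be sketched as follows, under the stated assumption of one doubling per cycle and taking the 0.1 U McrBC library as the 100% reference. The cycle values in the example are made-up placeholders for illustration only and are not data from FIG. 9.

```python
def effective_cleavage_percent(cycle_sample: float, cycle_reference: float) -> float:
    """Effective % digestion relative to the reference (0.1 U McrBC) library.

    Each cycle of delay relative to the reference is taken to mean half as
    much amplifiable library (perfect doubling per cycle is an assumption).
    """
    delay = cycle_sample - cycle_reference
    return 100.0 * 2.0 ** (-delay)


if __name__ == "__main__":
    reference_cycle = 10.0  # hypothetical cycle at 50% of max RFU for 0.1 U
    # Hypothetical cycles at 50% of max RFU for libraries made with less enzyme.
    for units, cycle in [(0.10, 10.0), (0.08, 10.5), (0.06, 11.5), (0.04, 13.0)]:
        pct = effective_cleavage_percent(cycle, reference_cycle)
        print(f"{units:.2f} U McrBC: {pct:.1f}% effective cleavage")
```

A delay of exactly one cycle corresponds to 50% effective cleavage, so the enzyme amount giving that delay is the amount the text reports as producing a 50% cleavage rate.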

In order to investigate and optimize further the conditions for McrBC digestion, additional experiments were performed using different amounts of enzyme as well as different temperatures of incubation, and the resulting fragments were analyzed by field inversion gel electrophoresis.

Genomic DNA purchased from the Coriell Institute for Medical Research (repository # NA14657) was used as template for McrBC cleavage. Aliquots of 500 ng of DNA were cleaved with McrBC in 15 μl of 1× NEBuffer 2 containing 100 μg/ml BSA, 1 mM GTP, and 0, 2, 5, or 10 units of McrBC nuclease (NEB) at 37° C. for 90 min, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme. In another set of samples, 10 units of McrBC were used as described above, but the incubation was at 16° C., 25° C., or 37° C.

To prevent potential gel retardation due to rehybridization of overhangs, samples cleaved with different amounts of McrBC were either left untreated or incubated with 5 units of Taq polymerase and 200 μM of each dNTP at 65° C. for 1 minute to fill in any recessed 3′ ends. Samples were then heated at 75° C. for 1 minute and analyzed on a 1% pulse-field agarose gel using a Field Inversion Gel Electrophoresis System (Bio-Rad) preset program 2 for 14 hours in 0.5× TBE buffer. The gel was stained with SYBR Gold (Molecular Probes). FIG. 10 shows the distribution of fragments obtained after McrBC cleavage. After digestion using 10 units of McrBC at 37° C., the average apparent size of fragments generated from human genomic DNA was approximately 7 Kb and the range was from less than 1 Kb to about 30 Kb. Reducing the temperature or reducing the amount of enzyme resulted in less complete cleavage, but the size distribution of fragments was similar. As evident from the figure, changing the temperature of incubation is a more efficient way of controlling the level of cleavage as compared to changing the amount of enzyme. The present inventors attribute this, in specific embodiments, to the necessity to maintain a certain stoichiometry between the subunits of the nuclease and the template.

Example 4 Gel Fractionation of McrBC Cleavage Products and Analysis of the Segregation of Sites Internal to, or Flanking, Promoter CpG Islands

This example describes the analysis of the segregation of McrBC cleavage products along an agarose gel as a function of CpG methylation.

One microgram of exemplary control genomic DNA (Coriell repository # NA14657) or exemplary KG1-A leukemia cell DNA was subjected to complete digestion with McrBC nuclease in 25 μl of 1× NEBuffer 2 (NEB) containing 100 μg/ml BSA, 1 mM GTP, and 10 units of McrBC nuclease (NEB) at 37° C. for 90 minutes, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme. Samples were extracted with phenol:chloroform:isoamyl alcohol (25:24:1) to prevent gel retardation, precipitated with ethanol, and dissolved in 15 μl of TE-L buffer.

Samples were loaded on a 15 cm long 1% agarose gel, electrophoresed at 5 V per cm in a modified TAE buffer (containing 0.5 mM EDTA), and stained with SYBR Gold (Molecular Probes). Gel lanes were sliced into segments of 0.75 cm, each corresponding approximately to the following sizes based on molecular weight markers: 7.5 to 12 Kb, 4.5 to 7.5 Kb, 3.0 to 4.5 Kb, 2.0 to 3.0 Kb, 1.5 to 2.0 Kb, 1.0 to 1.5 Kb, 0.65 to 1.0 Kb, 0.4 to 0.65 Kb, 0.25 to 0.4 Kb, and 0.05 to 0.25 Kb. DNA was extracted with Ultrafree DA centrifugal devices (Millipore) at 5,000 × g for 10 minutes and 10 μl was used as template for amplification using primers specific for sites internal to, or flanking, promoter CpG islands. Primer pairs were used as follows: p15 promoter (SEQ ID NO: 24 forward and SEQ ID NO: 25 reverse), p16 promoter (SEQ ID NO: 26 forward and SEQ ID NO: 27 reverse), E-Cadherin promoter (SEQ ID NO: 28 forward and SEQ ID NO: 29 reverse) for sites internal to CpG islands, and p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 52 forward and SEQ ID NO: 53 reverse) for sites flanking the CpG islands. PCR amplification was carried out in a reaction mixture comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000), SYBR Green I (1:100,000), 200 nM each forward and reverse primer, and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 94° C. for 15 seconds and 68° C. for 1 minute for a varying number of cycles until a plateau was reached in the amplification curves. Ten microliters of each PCR reaction were analyzed on 1% agarose gel stained with ethidium bromide.

FIG. 11 shows distribution plots of the gel fractions against the reciprocal of the threshold amplification cycle for each real-time PCR curve as well as the PCR products separated on agarose gel. All of the amplified sites were shifted toward lower molecular weight fractions in cancer versus normal cells, indicating that hypermethylated regions in cancer cells are digested extensively by McrBC nuclease. On the other hand, the methylation signal of the E-Cadherin promoter was found in the intermediate size fractions, indicating that the regions flanking this promoter in normal cells are heavily methylated and thus are cleaved by McrBC nuclease, thereby generating more background as compared to the other two gene promoters studied. The broad size distribution of the smaller products in DNA from cancer cells can be explained by trapping of DNA in agarose gels causing retardation and trailing of the peaks toward an apparent higher molecular weight (E. Kamberov, unpublished observation).
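
The kind of distribution plotted in FIG. 11 can be sketched as a mapping from the ten gel fractions of this example to the reciprocal of the threshold cycle measured for each fraction. The size ranges are taken from the text; the threshold-cycle values are made-up placeholders, and the helper names are illustrative only.

```python
# Gel fraction size ranges in Kb, in the order listed in this example.
FRACTIONS_KB = [
    (7.5, 12.0), (4.5, 7.5), (3.0, 4.5), (2.0, 3.0), (1.5, 2.0),
    (1.0, 1.5), (0.65, 1.0), (0.4, 0.65), (0.25, 0.4), (0.05, 0.25),
]


def reciprocal_ct(ct_values):
    """Signal proxy used for the distribution plot: 1 / threshold cycle."""
    return [1.0 / ct for ct in ct_values]


def peak_fraction(ct_values):
    """Size range (Kb) of the fraction with the strongest signal."""
    if len(ct_values) != len(FRACTIONS_KB):
        raise ValueError("expected one threshold cycle per gel fraction")
    signal = reciprocal_ct(ct_values)
    best = max(range(len(signal)), key=signal.__getitem__)
    return FRACTIONS_KB[best]


if __name__ == "__main__":
    # Hypothetical threshold cycles, one per fraction (not measured data).
    normal = [24, 25, 27, 29, 31, 32, 33, 34, 35, 36]
    cancer = [34, 33, 31, 29, 27, 25, 24, 26, 28, 30]
    print("normal peak fraction (Kb):", peak_fraction(normal))
    print("cancer peak fraction (Kb):", peak_fraction(cancer))
```

With these placeholder values the peak moves from the largest fraction in the normal sample to a sub-kilobase fraction in the cancer sample, which is the direction of shift the text describes for hypermethylated sites.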

Example 5 Analysis of the Methylation Status of Promoter CpG Islands by Cleavage with McrBC Nuclease Followed by PCR Amplification

This example describes a simple McrBC-mediated direct assay for methylation of CpG promoter islands based on the ability of the McrBC nuclease to cleave between two methylated cytosines. The cleavage reaction between sites flanking multiple methylated cytosines results in a lack of PCR amplification from the priming sites and generates a negative signal for methylation.

Genomic DNA purchased from the Coriell Institute for Medical Research (repository # NA16028) was used as a negative control for CpG island methylation. The same DNA was also fully methylated with 4 units of SssI CpG Methylase (NEB) in 50 μl according to the manufacturer's protocol, to serve as a positive control for methylation. Genomic DNA from exemplary KG1-A leukemia cells purified by a standard procedure was used as a test sample for CpG island promoter hypermethylation.

McrBC cleavage of control DNA, SssI methylated DNA, or KG1-A test DNA was performed in 50 μl of 1× NEBuffer 2 containing 5 μg of DNA, 100 μg/ml BSA, 1 mM GTP, and 35 units of McrBC nuclease (NEB) at 37° C. for 90 minutes, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme.

Five nanogram aliquots of each McrBC digested sample or control non-digested DNA were amplified by quantitative real-time PCR in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM of primers specific for CpG regions of the following promoters: p15 (Accession # AF513858), p16 (Accession # AF527803), E-Cadherin (Accession # AC099314), and GSTP-1 (Accession # M24485) (SEQ ID NO: 24 + SEQ ID NO: 25, SEQ ID NO: 26 + SEQ ID NO: 27, SEQ ID NO: 28 + SEQ ID NO: 29, and SEQ ID NO: 30 + SEQ ID NO: 31, respectively), 4% DMSO, and 2 units of Titanium Taq polymerase (Clontech) in a final volume of 30 μl. Amplifications were carried out at 94° C. for 15 seconds and 65° C. for 1 minute on an I-Cycler real-time PCR instrument (Bio-Rad) for a varying number of cycles until a plateau was reached on the amplification curves of the negative controls. Ten microliters of each PCR reaction were analyzed on 1% agarose gel after staining with ethidium bromide.

FIG. 12 shows the result of the promoter methylation analysis. After digestion with McrBC, fully methylated control DNA displayed a complete lack of amplification for all four promoter sites, whereas control DNA amplified normally with or without McrBC cleavage. The test cancer DNA from KG1-A leukemia cells showed strong hypermethylation in three out of the four analyzed promoters.
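
One way to turn the negative-signal readout of this assay into a call is to compare the threshold cycle of the McrBC-digested aliquot with that of the non-digested aliquot for the same promoter. The sketch below is illustrative only: the 3-cycle cutoff and all threshold-cycle values are hypothetical and are not taken from the source, and `None` is used to represent a reaction that never amplified.

```python
from typing import Optional


def delta_ct(ct_digested: float, ct_undigested: float) -> float:
    """Delay (in cycles) caused by McrBC digestion for one promoter site."""
    return ct_digested - ct_undigested


def call_methylation(ct_digested: Optional[float],
                     ct_undigested: float,
                     threshold_cycles: float = 3.0) -> str:
    """Classify a promoter site from paired digested / non-digested reactions.

    `threshold_cycles` is a hypothetical cutoff chosen for illustration.
    A digested reaction that never amplified (None) is treated as methylated,
    since cleavage between the priming sites removes the template.
    """
    if ct_digested is None:
        return "methylated"
    if delta_ct(ct_digested, ct_undigested) >= threshold_cycles:
        return "methylated"
    return "unmethylated"


if __name__ == "__main__":
    # Hypothetical paired threshold cycles (digested, non-digested).
    examples = {
        "unmethylated control": (24.3, 24.0),
        "fully methylated control": (None, 24.1),
        "test sample": (31.0, 24.2),
    }
    for name, (digested, undigested) in examples.items():
        print(name, "->", call_methylation(digested, undigested))
```

In practice any cutoff would have to be calibrated against the unmethylated and fully methylated controls run alongside the test sample.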

Example 6 Analysis of DNA Methylation by One-sided PCR from McrBC Cleavage Sites

This example describes development of a McrBC-mediated library diagnostic assay for promoter CpG island hypermethylation based on ligation of a universal adaptor to McrBC cleavage sites followed by incorporation of a poly-C tail allowing one-sided PCR between the homopolymeric sequence and a specific site flanking the CpG island.

Five micrograms of control genomic DNA (Coriell repository # NA16028) or genomic DNA from exemplary KG1-A leukemia cells were digested with McrBC nuclease in 50 μl of 1× NEBuffer 2 containing 100 μg/ml BSA, 1 mM GTP, and 35 units of McrBC nuclease (NEB) at 37° C. for 90 min, followed by incubation at 65° C. for 20 min to inactivate the enzyme.

In the next step, a universal T7 promoter sequence was ligated to McrBC cleavage fragments that were polished, following cleavage, to produce blunt ends. Aliquots of 100 ng of each sample were blunt-ended with Klenow fragment of DNA polymerase I (USB) in 10 μl of 1× T4 Ligase buffer (NEB) containing 2 nM of each dNTP at 25° C. for 15 minutes. Universal T7 adaptors were assembled in 10 mM KCl containing 10 μM 20 μM T7GG (SEQ ID NO: 32) and 20 μM T7SH (SEQ ID NO: 34) to form a blunt end adaptor; 20 μM T7GG (SEQ ID NO: 32) and 40 μM of T7NSH (SEQ ID NO: 35) to form a 5′ N overhang adaptor; and 20 μM T7GGN (SEQ ID NO: 33) and 40 μM of T7SH (SEQ ID NO: 34) to form a 3′ N overhang adaptor (see Table I for exemplary oligonucleotide sequences). Adaptor mixtures were heated at 65° C. for 1 minute, cooled to room temperature and incubated for 5 min on ice. The tubes were combined in a 2:1:1 ratio (blunt end: 5′ N overhang: 3′ N overhang) and kept on ice prior to use. Ligation reactions were performed in 16 μl of 1× T4 Ligase buffer (NEB), containing 100 ng of blunt-end template DNA, 3.75 μM final concentration of T7 adaptors, and 2,000 units of T4 DNA Ligase (NEB) at 16° C. for 1 hour, followed by incubation at 75° C. for 10 minutes to inactivate the ligase.

Next, homo-polymeric extensions were incorporated at the ends of the fragments using a primer T7-C₁₀ (SEQ ID NO: 36) comprising ten C bases at the 5′ end followed by a 3′ T7 promoter sequence. This sequence allows asymmetric one-sided PCR amplification due to the strong suppression effect of the terminal poly-G/poly-C duplex, which makes amplification between the terminal inverted repeats very inefficient (U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No. 2004/0209299, now abandoned; and U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). PCR amplification was carried out by quantitative real-time PCR in a reaction mixture comprising 1× Titanium Taq reaction buffer (Clontech), 5 ng of McrBC library DNA with ligated universal T7 adaptors, 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO, 1:100,000 dilutions of fluorescein and SYBR Green I (Molecular Probes), 1 μM T7-C₁₀ primer (SEQ ID NO: 36), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl. Amplification was carried out at 72° C. for 10 min to fill in the 3′-recessed ends, followed by 18 cycles at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Samples were purified on QIAQUICK® PCR purification filters (Qiagen).

To analyze the methylation status of promoter CpG islands, one-sided PCR was performed using 50 ng of purified McrBC library DNA from normal or cancer cells, a universal C₁₀ primer comprising ten C bases, and primers specific for regions flanking the CpG islands of different exemplary promoters implicated in epigenetic control of carcinogenesis. PCR amplification was carried out by quantitative real-time PCR in a reaction mixture comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 1:100,000 dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), 200 nM C₁₀ primer (SEQ ID NO: 38), 200 nM of primer specific for p15 promoter (SEQ ID NO: 39), p16 promoter (SEQ ID NO: 40), or E-Cadherin promoter (SEQ ID NO: 41 or SEQ ID NO: 42), and 3.5 units of Titanium Taq polymerase (Clontech) in a final volume of 35 μl. Amplification was at 94° C. for 15 seconds and 68° C. for 1 minute on an I-Cycler real-time PCR instrument (Bio-Rad) for a varying number of cycles until a plateau was reached for the cancer DNA samples. Ten microliters of each PCR reaction were analyzed on 1% agarose gel after staining with ethidium bromide.

FIG. 13 shows that a positive signal was generated from the hypermethylated cancer DNA CpG islands. Among the promoters studied, the p15 promoter had the highest ratio of cancer vs. normal signal, followed by the p16 promoter. The E-Cadherin gene promoter, on the other hand, showed a very slight difference between cancer and normal DNA. When a primer specific for a region flanking the E-Cadherin CpG island on its 3′ end was used, the assay produced an inverse signal (i.e., positive for normal and negative for cancer) that the present inventors in specific embodiments interpret as interference coming from methylated regions flanking the CpG islands in the 3′ direction. The transcribed regions adjacent to the 3′ end of most CpG islands in normal cells are known to be heavily methylated, whereas, for promoters involved in epigenetic control of carcinogenesis in cancer cells, these regions are largely hypomethylated (Baylin and Herman, 2000).

To determine the sensitivity limits of the assay, different ratios of McrBC libraries prepared from normal or cancer cells as described above were mixed and then amplified with the universal C₁₀ primer (SEQ ID NO: 38) and a primer specific for the p15 promoter 5′ flanking region (SEQ ID NO: 39). The total amount of DNA was 50 ng per amplification reaction containing 0, 0.1, 1.0, 10, 50, or 100% of cancer DNA. One-sided PCR amplification was done as described above. The result of this experiment showed that as little as 0.1% of cancer DNA can be detected in a background of 99.9% normal DNA, corresponding to 1 cancer cell in about 1000 normal cells (FIG. 14).
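
The composition of the mixing series described above follows directly from the 50 ng total and the stated percentages. The short sketch below works out the nanogram amounts of cancer and normal library DNA in each reaction; the function name is illustrative only.

```python
TOTAL_NG = 50.0
CANCER_PERCENT = [0, 0.1, 1.0, 10, 50, 100]


def mixture_ng(total_ng: float, cancer_percent: float):
    """Return (cancer_ng, normal_ng) for a mixture of the given composition."""
    cancer_ng = total_ng * cancer_percent / 100.0
    return cancer_ng, total_ng - cancer_ng


if __name__ == "__main__":
    for pct in CANCER_PERCENT:
        cancer_ng, normal_ng = mixture_ng(TOTAL_NG, pct)
        print(f"{pct:>5}% cancer: {cancer_ng:.2f} ng cancer + {normal_ng:.2f} ng normal")
```

At the 0.1% point the reaction contains about 0.05 ng of cancer library DNA in 49.95 ng of normal library DNA, a 1:999 mass ratio, which is the basis for the "1 cancer cell in about 1000 normal cells" figure.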

Example 7 Preparation of Nick-translation DNA Libraries from Fragments Originating at McrBC Cleavage Sites for Analysis of DNA Methylation

In this example, a McrBC-mediated library diagnostic assay is described in which a nick-attaching biotinylated adaptor is ligated to McrBC cleavage sites, the nick is propagated to a controlled distance from the adaptor, and the uniformly sized nick-translation products are immobilized on a solid phase and analyzed for the presence of sequences internal to, or flanking, a CpG island. The McrBC libraries of this type can also be used for discovery of unknown hypermethylated promoters by sequencing or by hybridization to microarrays.

One microgram of control genomic DNA (Coriell repository # NA16028) or KG1-A leukemia cell DNA was subjected to limited digestion with McrBC nuclease in 25 μl of 1× NEBuffer 2 (NEB) containing 100 μg/ml BSA, 1 mM GTP, and 2 units of McrBC nuclease (NEB) at 37° C. for 1 hour, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme.

The ends of the digested fragments were blunt-ended with the Klenow fragment of DNA polymerase I (USB) in 100 μl of 1× T4 Ligase buffer (NEB) with 2 nM of each dNTP at 25° C. for 15 minutes followed by blunt-end ligation of biotinylated nick-attaching adaptor. The adaptor was assembled in 10 mM KCl containing 18 μM Adapt Backbone (SEQ ID NO: 43), 15 μM Adapt Biot (SEQ ID NO: 44), and 15 μM Adapt Nick (SEQ ID NO: 45) (Table I) by heating at 95° C. for 1 minute, cooling to room temperature, and incubation for 5 min on ice. Ligation reactions were performed in 160 μl of 1× T4 Ligase buffer (NEB), containing 1 μg of blunt-end template DNA, 3.75 μM of biotinylated nick-attaching adaptor, and 20,000 units of T4 DNA Ligase (NEB) at 16° C. for 1 hour, followed by incubation at 75° C. for 10 minutes to inactivate the ligase. Samples were purified on QIAQUICK® PCR filters (Qiagen) and reconstituted in 70 μl of TE-L buffer.

Samples were further subjected to nick-translation in a total of 100 μl of 1× ThermoPol buffer (NEB) containing 200 μM of each dNTP and 5 units of Taq polymerase (NEB) at 50° C. for 5 minutes. Reactions were stopped by adding 5 μl of 0.5 M EDTA, pH 8.0.

Nick-translation products were denatured at 100° C. for 5 minutes, snap-cooled on ice and mixed with 300 μg M-280 streptavidin paramagnetic beads (Dynal) in an equal volume of 2× binding buffer containing 20 mM Tris-HCl, pH 8.0, 1 M LiCl, and 2 mM EDTA. After rotating the tubes for 30 minutes at room temperature, the beads were washed 4 times with 70 μl of TE-L buffer, 2 times with 70 μl of freshly prepared 0.1 N KOH, and 4 times with 80 μl of TE-L buffer. The beads were resuspended in 50 μl of TE-L buffer and stored at 4° C. prior to use.

Two microliters of streptavidin beads suspension were used to amplify specific regions flanking promoter CpG islands from libraries prepared from DNA of normal or cancer cells. To prevent fluorescence quenching, PCR library synthesis was carried out in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 200 nM each forward and reverse primer specific for p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 50 forward and SEQ ID NO: 51 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by 10 cycles at 94° C. for 15 seconds and 68° C. for 1 minute. After removal of beads and addition of 1:100,000 dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), amplification was continued at 94° C. for 15 seconds and 68° C. for 1 minute on an I-Cycler real-time PCR instrument (Bio-Rad) for a varying number of cycles until a plateau was reached for the cancer DNA samples. Ten microliters of each PCR reaction were analyzed on 1% agarose gel stained with ethidium bromide.

FIG. 15 shows the results of the methylation analysis. The positive signal generated from hypermethylated p15 and p16 promoters in KG1-A cancer cells was equally strong, while the signal for E-Cadherin was weaker, but still clearly distinguishable from the signal amplified from normal cells.

In order to produce a sufficient amount of DNA for analysis of multiple promoter sites and for micro-array hybridization, the present inventors studied the possibility of amplification of the McrBC libraries described above using a method for whole genome amplification with self-inert degenerate primers as described (U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). Seventeen microliters of streptavidin beads suspension of each library were resuspended in 14 μl of 1× EcoPol buffer (NEB), 200 μM of each dNTP, 1 μM degenerate K(N)₂ primer (SEQ ID NO: 14), 15 ng/μl and 4% DMSO. After a denaturing step of 2 minutes at 95° C., the samples were cooled to 24° C., and the reaction was initiated by adding 5 units of Klenow Exo- (NEB). The library synthesis reactions were carried out at 24° C. for 1 hr. Reactions were stopped with 1 μl of 83 mM EDTA (pH 8.0), and the samples were heated for 5 minutes at 75° C. The entire reaction mixture was further amplified by real-time PCR in a 75 μl volume containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal K_(U) primer (SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech). Reactions were carried out at 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad) for a varying number of cycles until reaching a plateau. After cleaning the samples by QIAQUICK® PCR filters (Qiagen), 10 ng of amplified normal or cancer DNA were used to amplify specific regions flanking promoter CpG islands. PCR amplification was carried out in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer specific for p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by 94° C. for 15 sec and 68° C. for 1 minute for a varying number of cycles until a plateau was reached for the cancer samples. Ten microliters of each PCR reaction were analyzed on 1% agarose gel after staining with ethidium bromide.

FIG. 16 shows the amplification of a sequence flanking the CpG island of the p15 promoter in normal and cancer cells. The difference in methylation between the two samples was similar to that found in non-amplified libraries (see FIG. 15). This demonstrates that sufficient amounts of DNA can be generated for analysis of methylation in multiple promoters as well as for discovery of unknown hypermethylated promoters.

Example 8 Preparation of DNA Libraries from Fragments Originating at McrBC Cleavage Sites by Direct Biotin Incorporation for Analysis of DNA Methylation

In this example a McrBC-mediated library diagnostic assay is described in which 3′ recessed ends of McrBC cleavage sites are extended in the presence of a biotin-comprising nucleotide analog, followed by DNA fragmentation, immobilization on a solid support, and analysis of sequences internal to, or flanking, a CpG island. The McrBC libraries of this type can also be used for discovery of unknown hypermethylated promoters by sequencing or by hybridization to microarrays, for example.

One microgram of control genomic DNA (Coriell repository # NA16028) or KG1-A leukemia cell DNA was subjected to limited digestion with McrBC in 25 μl of 1× NEBuffer 2 (NEB) containing 100 μg/ml BSA, 1 mM GTP, and 2 units of McrBC nuclease (NEB) at 37° C. for 1 hour, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme.

The 3′ recessed ends of the DNA fragments were extended with Klenow fragment of DNA polymerase I in 100 μl of 1× T4 Ligase buffer (NEB) with 20 nM each of dATP, dCTP, and dGTP, 25 nM Biotin-21-dUTP (Clontech), and 6 units of the Klenow Exo- (USB) at 25° C. for 20 minutes followed by 75° C. for 10 minutes. After QIAQUICK® clean-up (Qiagen) the samples were recovered in 70 μl of TE-L buffer.

The labeled DNA was fragmented by heating at 95° C. for 4 minutes, snap-cooled on ice for 2 minutes, and mixed with 300 μg M-280 streptavidin paramagnetic beads (Dynal) in an equal volume of 2× binding buffer containing 20 mM Tris-HCl, pH 8.0, 1 M LiCl, and 2 mM EDTA. After rotating the tubes for 1 hour at room temperature, the beads were washed 3 times with 80 μl of TE-L buffer, 1 time with 70 μl of freshly prepared 0.1 N KOH, and 4 times with 80 μl of TE-L buffer. The beads were resuspended in 50 μl of TE-L buffer and stored at 4° C. prior to use.

Two microliters of streptavidin beads suspension were used to amplify specific regions flanking promoter CpG islands from libraries prepared from DNA of normal or cancer cells. To prevent fluorescence quenching, PCR library synthesis was carried out in a reaction mixture comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 200 nM each forward and reverse primer specific for the human p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by 10 cycles at 94° C. for 15 sec and 68° C. for 1 minute. After removal of beads and addition of 1:100,000 dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), amplification was continued at 94° C. for 15 seconds and 68° C. for 1 min on an I-Cycler real-time PCR instrument (Bio-Rad) for a varying number of cycles until a plateau was reached for the cancer DNA samples. Ten microliters of each PCR reaction were analyzed on 1% agarose gel after staining with ethidium bromide.

FIG. 17 shows the results of the methylation analysis. A strong positive signal was generated from hypermethylated p15 promoters in KG1-A cancer cells, but not from control cells.

In order to produce sufficient DNA for analysis of multiple promoter sites and for micro-array hybridization, we tested the possibility of amplification of the McrBC libraries described above using our patented method for whole genome amplification with self-inert degenerate primers (U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403). Seventeen microliters of streptavidin beads suspension of each library were resuspended in 14 μl of 1× EcoPol buffer (NEB), 200 μM of each dNTP, 1 μM degenerate K(N)₂ primer (SEQ ID NO: 14), 15 ng/μl, and 4% DMSO. After a denaturing step of 2 minutes at 95° C., the samples were cooled to 24° C., and the reaction was initiated by adding 5 units of Klenow Exo- (NEB). The library synthesis reactions were done at 24° C. for 1 hour. Reactions were stopped with 1 μl of 83 mM EDTA (pH 8.0), and samples were heated for 5 minutes at 75° C. The entire reaction mixture was further amplified by quantitative real-time PCR in 75 μl volume containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal K_(U) primer (SEQ ID NO: 15), 4% DMSO, and 5 units of Titanium Taq polymerase (Clontech). Reactions were carried out at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad) for various numbers of cycles until reaching a plateau. After cleaning the samples by QIAQUICK® PCR filters (Qiagen) 10 ng of amplified normal or cancer DNA were used to amplify specific regions flanking promoter CpG islands.
PCR amplification was carried out in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each of forward and reverse primer specific for p15 promoter (SEQ ID NO: 46 forward and SEQ ID NO: 47 reverse), p16 promoter (SEQ ID NO: 48 forward and SEQ ID NO: 49 reverse), or E-Cadherin promoter (SEQ ID NO: 50 forward and SEQ ID NO: 51 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 min followed by 94° C. for 15 sec and 68° C. for 1 min for different numbers of cycles until a plateau was reached for the cancer samples. Ten microliters of each PCR reaction were analyzed on a 1% agarose gel after staining with ethidium bromide.

FIG. 18 shows the amplification of sequences flanking the CpG island of p15, p16, and E-Cadherin promoters in normal and cancer cells. The difference in methylation between the cancer and control samples was similar to that found in non-amplified libraries (see FIG. 17). This demonstrates that sufficient amounts of DNA can be generated for analysis of methylation in multiple promoters as well as for discovery of unknown hypermethylated promoters.

Example 9 Analysis of the Termini Produced by McrBC and Direct Ligation of Adaptors with 5′-overhangs to McrBC Cleavage Sites without Prior Enzymatic Repair

This example describes the analysis of the nature of the DNA ends produced by McrBC nuclease digestion. It also shows that the ends produced by McrBC cleavage are directly competent for ligation to adaptors having random 5′-overhangs without any further enzymatic repair, and defines the minimum length of these overhangs.

In order to investigate the characteristics of McrBC cleavage, several experiments were conducted to determine the types of ends that are generated. Initial experiments compared the ability of McrBC digested DNA to have universal adaptors ligated to the resulting ends and be amplified. Specifically, 1 μg of genomic DNA was digested in the presence of 0.1 U of McrBC overnight at 37° C.

The requirement of polishing for ligation to the resulting 3′ ends of the digested DNA was investigated by comparing polishing with Klenow, polishing with Klenow Exo-, and no polishing. Specifically, 1.1 μl 10× T4 DNA ligase buffer, 0.02 μl dNTP (200 nM FC) and 0.84 μl H₂O were added to 8 μl of fragmented DNA (100 ng). Finally, 0.04 μl of H₂O, Klenow (2.3 U, NEB) or Klenow Exo- (2.3 U, NEB) were added to the appropriate tubes. The reaction was carried out at 25° C. for 15 minutes, and the polymerase was inactivated at 75° C. for 15 minutes and then chilled to 4° C.

Universal T7 adaptors were assembled in 10 mM KCl containing 10 μM T7GG (SEQ ID NO: 32) and 20 μM T7SH (SEQ ID NO: 34) to form a blunt end adaptor; 20 μM T7GG (SEQ ID NO: 32) and 40 μM of T7NSH (SEQ ID NO: 35) to form a 5′ N overhang adaptor; and 20 μM T7GG (SEQ ID NO: 33) and 40 μM of T7GGN (SEQ ID NO: 34) to form a 3′ N overhang adaptor (see Table I for oligonucleotide sequences). Adaptor mixtures were heated at 65° C. for 1 minute, cooled to room temperature and incubated for 5 min on ice prior to use. T7 adaptors were ligated to the 5′ ends of the DNA using T4 DNA ligase by addition of 0.5 μl 10× T4 DNA ligase buffer, 0.5 μl H₂O, 4 μl T7 adaptors (10 pmol each of the blunt end, 5′ N overhang, and 3′ N overhang adaptors) and 1 μl T4 DNA Ligase (2,000 U). The reaction was carried out for 1 hour at 16° C., the enzyme was inactivated at 65° C. for 10 minutes, and the samples were held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the universal adaptor and subsequent amplification of the library were carried out under the same conditions. Five ng of library or H₂O (No DNA control) is added to a 25 μl reaction comprising 25 pmol T7-C₁₀ (SEQ ID NO: 36) universal primer, 120 nmol dNTP, 1× PCR Buffer (Clontech), 1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) were also added to allow monitoring of the reaction using the I-Cycler Real-Time Detection System (Bio-Rad). The samples are initially heated to 75° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short, blocked fragment of the universal adaptor. Subsequently, amplification is carried out by heating the samples to 95° C. for 3 minutes 30 seconds, followed by 18 cycles of 94° C. for 15 seconds, and 65° C. for 2 minutes. The amplification curves for all 3 samples are depicted in FIG. 19A. The amplification of the sample without polishing was identical to the no DNA control, indicating that McrBC cleavage does not result in the production of blunt ends. However, both Klenow and Klenow Exo- libraries amplified with identical kinetics, indicating that although polishing is required, the DNA termini resulting from McrBC cleavage consist of 5′ overhangs with competent 3′ ends.
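The fill-in and cycling program described above can be written down as a small data structure, which makes it easy to check the programmed hold time. The following is only an illustrative sketch: the `Step` class and the function name are hypothetical, and ramp times between temperatures are ignored because the text does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold: temperature in degrees C and duration in seconds."""
    temp_c: float
    seconds: float


def program_hold_seconds(pre_steps, cycle_steps, n_cycles):
    """Sum the programmed hold times only (instrument ramp times are ignored)."""
    pre = sum(step.seconds for step in pre_steps)
    cycled = n_cycles * sum(step.seconds for step in cycle_steps)
    return pre + cycled


# Values taken from the protocol text above.
FILL_IN = Step(temp_c=75.0, seconds=15 * 60)        # adaptor fill-in
DENATURE = Step(temp_c=95.0, seconds=3 * 60 + 30)   # initial denaturation
CYCLE = (Step(temp_c=94.0, seconds=15), Step(temp_c=65.0, seconds=2 * 60))

total_seconds = program_hold_seconds((FILL_IN, DENATURE), CYCLE, n_cycles=18)
total_minutes = total_seconds / 60
print(f"Programmed hold time: {total_minutes:.1f} min")
```

Under these assumptions the 18-cycle program holds for 59 minutes in total; the actual run time on an instrument would be longer because of ramping.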

In order to further explore the possibility that the ends produced by McrBC cleavage are directly competent for ligation without any further enzymatic repair, adaptors comprising universal T7 promoter sequence and different numbers of random base 5′-overhangs were compared for their ligation efficiency in direct ligation reaction with genomic DNA digested with McrBC. A sample of McrBC-digested DNA that was rendered blunt-ended with Klenow fragment of DNA polymerase I and ligated to a blunt end adaptor was used as a positive control to assess ligation efficiency.

One hundred nanograms of genomic DNA (Coriell repository # NA16028) was digested with McrBC nuclease in 10 μl of 1× NEBuffer 2 containing 100 μg/ml BSA, 1 mM GTP, and 10 units of McrBC nuclease (NEB) at 37° C. for 90 min, followed by incubation at 65° C. for 20 minutes to inactivate the enzyme. An aliquot of 12.5 ng of digested DNA was blunt-ended with Klenow fragment of DNA polymerase I (USB) in 10 μl of 1× T4 Ligase buffer (NEB) containing 2 nM of each dNTP at 25° C. for 15 min.

Adaptors comprising universal T7 promoter sequence with 5′ overhangs comprising from 0 to 6 completely random bases were assembled in 1× T4 Ligase buffer (New England Biolabs) containing 15 μM T7GG (SEQ ID NO: 32) and 30 μM T7SH (SEQ ID NO: 34) to form a blunt end adaptor; or 15 μM T7GG (SEQ ID NO: 32) and 30 μM of T7SH-2N, T7SH-3N, T7SH-4N, T7SH-5N, or T7SH-6N (SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, and SEQ ID NO: 59, respectively) to form 5′ N overhang adaptors with 2, 3, 4, 5, or 6 bases respectively (see Table I for oligonucleotide sequences). Adaptor mixtures were heated at 95° C. for 1 min, cooled to room temperature and incubated for 5 min on ice prior to use.

T7 adaptors with 0, 2, 3, 4, 5 or 6 random base overhangs were then ligated to 12.5 ng (10 μl) aliquots of the McrBC digested DNA by adding 0.5 μl of 10× T4 DNA ligase buffer, 0.5 μl H₂O, 4 μl T7 adaptors (15 pmol), and 1 μl T4 DNA Ligase (2,000 U). The reactions were carried out for 1 hour at 16° C. and the enzyme was inactivated at 65° C. for 10 minutes. A control blunt-end ligation reaction with 12.5 ng of polished DNA (see above) and blunt-end T7 adaptor (0 overhang) was run in parallel under the same conditions.

In the next step, extension of the 3′ ends to fill in the universal adaptors and subsequent amplification of the libraries were performed. Five nanograms of DNA from each sample (or ligation buffer used as negative control) were added to 50 μl reactions comprising 1 μM T7 universal primer (SEQ ID NO: 37), 200 μM of each dNTP, 4% DMSO, 1× PCR Buffer (Clontech), 1× Titanium Taq, fluorescein calibration dye (1:100,000), and SYBR Green I (1:100,000). The extension and amplification were carried out using I-Cycler Real-Time Detection System (Bio-Rad). The samples were initially heated to 72° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short, blocked fragments of the universal adaptors. After denaturation at 95° C. for 3.5 minutes, library DNA was amplified for 23 cycles at 94° C. for 15 seconds, and 65° C. for 2 minutes. The amplification curves for all 7 samples and the negative control are depicted in FIG. 19B. The amplification of non-polished samples ligated to adaptors with 5 or 6 base overhangs was virtually identical to the control polished sample ligated to the blunt-end (0 overhang) adaptor, indicating that the 5′ overhangs produced by McrBC cleavage are at least 6 bases long. Adaptors with overhangs shorter than 5 bases were much less efficient. This result indicates that a minimum of 5 bases are required to support efficient hybridization and subsequent ligation of adaptors to McrBC overhangs under the conditions of the ligation reaction.
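Because the adaptor overhangs are completely random, each additional base multiplies the number of distinct overhang sequences in the adaptor pool by four. The short sketch below works through that arithmetic for the 15 pmol of adaptor used per ligation. It assumes, purely for illustration, that all four bases are equally represented at every random position; the function names are hypothetical and nothing here is taken from the sequence listing.

```python
def distinct_overhangs(n_bases: int) -> int:
    """Number of distinct sequences for a fully random overhang of n_bases."""
    return 4 ** n_bases


def pmol_per_sequence(total_pmol: float, n_bases: int) -> float:
    """Amount of each individual overhang sequence, assuming equal representation."""
    return total_pmol / distinct_overhangs(n_bases)


ADAPTOR_PMOL = 15.0  # adaptor added per ligation, from the protocol text

for n in (0, 2, 3, 4, 5, 6):
    per_seq_fmol = pmol_per_sequence(ADAPTOR_PMOL, n) * 1000.0
    print(f"{n}-base overhang: {distinct_overhangs(n):>4} sequences, "
          f"{per_seq_fmol:.2f} fmol of each")
```

For the 6-base adaptor this gives 4096 distinct overhangs, so any one specific overhang sequence makes up only a small share of the 15 pmol added.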

To determine the optimal amount of 5′ 6 base overhang T7 adaptor for efficient ligation to McrBC ends, 10 ng aliquots of McrBC-digested DNA were incubated with 1000 units of T4 ligase (New England Biolabs) in 1× T4 ligase buffer with 0, 0.032, 0.064, 0.125, 0.25, 0.5, or 1 μM final adaptor concentration. Ligation was carried out at 16° C. for 1 hour in a final volume of 30 μl. Two nanogram aliquots of the ligation reactions were amplified by real-time PCR following extension of the 3′ ends to fill in the universal adaptor. Six microliters of library DNA from each ligation reaction or H₂O (no DNA control) were added to a 75 μl reaction comprising 1 μM T7 universal primer (SEQ ID NO: 37), 200 μM of each dNTP, 4% DMSO, 1× PCR Buffer (Clontech), 1× Titanium Taq, Fluorescein calibration dye (1:100,000), and SYBR Green I (1:100,000). The samples were initially heated to 72° C. for 15 minutes to fill in the universal adaptor sequence. Subsequently, amplification was carried out for 22 cycles at 94° C. for 15 seconds and 65° C. for 2 min, following denaturation for 3.5 minutes at 95° C. using an I-Cycler Real-Time Detection System (Bio-Rad). As shown in FIG. 20, adaptor concentrations of 0.25 μM and above resulted in complete ligation under the conditions tested.
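The titration series above is stated as final concentrations; converting each to an absolute amount of adaptor in the 30 μl ligation is a one-line calculation, since 1 μM equals 1 pmol/μl. The sketch below is a minimal illustration with hypothetical names; it does not model ligation efficiency.

```python
FINAL_VOLUME_UL = 30.0
ADAPTOR_UM = (0.0, 0.032, 0.064, 0.125, 0.25, 0.5, 1.0)  # from the titration text


def pmol_in_reaction(conc_um: float, volume_ul: float) -> float:
    """Convert a final concentration (uM) and volume (ul) to pmol; 1 uM == 1 pmol/ul."""
    return conc_um * volume_ul


amounts_pmol = {c: pmol_in_reaction(c, FINAL_VOLUME_UL) for c in ADAPTOR_UM}

for conc, pmol in amounts_pmol.items():
    print(f"{conc:>5} uM -> {pmol:.2f} pmol adaptor in {FINAL_VOLUME_UL:.0f} ul")
```

On this basis, the 0.25 μM threshold reported for complete ligation corresponds to 7.5 pmol of adaptor per 30 μl reaction.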

Example 10 Preparation of Short Libraries for Analysis of DNA Methylation by Microcon Size Fractionation from McrBC Cleaved DNA

This example describes the utility of libraries comprising short DNA sequences obtained by membrane microfiltration, which originate at McrBC cleavage sites and are rendered amplifiable by ligation of universal adaptor sequence, for the analysis of the methylation status of promoter CpG sites.

Aliquots of 50 ng or 10 ng of genomic DNA isolated from exemplary KG1-A leukemia cells or control genomic DNA (Coriell repository # NA16028) were digested with McrBC nuclease in 10 μl of 1× NEBuffer 2 containing 100 μg/ml BSA, 1 mM GTP, and 1 unit of McrBC nuclease (NEB) at 37° C. for 35 minutes, followed by 65° C. for 10 min to inactivate the enzyme. T7 adaptor with 6 random 5′-base overhang (T7-N6) consisting of T7GG and T7SH-6N (SEQ ID NO: 32 and SEQ ID NO: 59, respectively) (Table I) was assembled as described in Example 9.

T7-N6 adaptor was ligated to the McrBC digested DNA samples in a final volume of 30 μl containing 1× T4 DNA ligase buffer, 1 μM T7 adaptor, 2,500 U of T4 DNA Ligase (New England Biolabs), and the entire 10 μl of the McrBC digestion samples. Ligation reactions were carried out for 1 hour at 16° C. and the enzyme was inactivated at 65° C. for 10 minutes.

The ligation reactions were next supplemented with 80 μl of filtration buffer containing 10 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, and 100 mM NaCl, and the DNA was size fractionated by passing the samples through Microcon YM-100 filters (Millipore) at 500×g at ambient temperature. Under these buffer conditions the Microcon filters retain DNA fragments above approximately 250 bp. The small fragments in the filtrate fractions were concentrated by ethanol precipitation and reconstituted in 15 μl of TE-L buffer.

In the next step, the 3′ ends of the universal adaptor are filled in by extension and the libraries are amplified by PCR. The samples from the previous step were supplemented with PCR reaction buffer comprising 1× Titanium Taq buffer (BD Clontech), 1 μM T7 universal primer (SEQ ID NO: 37), 200 μM of each dNTP, 4% DMSO, 1× Titanium Taq polymerase (BD Clontech), fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) in a final volume of 75 μl. Extension of the 3′ ends and subsequent amplification were performed on an I-Cycler Real-Time Detection System (Bio-Rad). After initial denaturation at 95° C. for 3.5 minutes the samples were heated to 72° C. for 15 minutes and then cycled at 94° C. for 15 seconds, and 65° C. for 2 minutes until a plateau was reached by the real-time amplification curves.

To quantify the short DNA fragments released by McrBC digestion from the p16 promoter CpG island, 5 ng of library material were used in a PCR reaction with a primer pair specific for a short internal promoter region. Amplification was carried out in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer specific for p16 promoter (SEQ ID NO: 61 forward and SEQ ID NO: 62 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by 94° C. for 15 seconds and 68° C. for 1 minute until a plateau was reached for the cancer samples by the real time amplification curves.

FIG. 21 shows the amplification of short sequence in the CpG island of the p16 promoter in normal and cancer cells from the libraries prepared by Microcon filtration. A difference of 7 cycles and 5 cycles between cancer and normal cells was obtained for libraries prepared from 10 ng and 50 ng of genomic DNA respectively. This demonstrates that sufficient amounts of DNA can be generated for analysis of methylation in multiple promoters as well as for discovery of unknown hypermethylated promoters from small amount of starting material.
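A difference in cycle number between two real-time amplification curves can be translated into an approximate fold difference in starting template. The sketch below assumes that the template doubles every cycle (amplification efficiency of 1.0); that assumption is not stated in the text, and real efficiencies are usually somewhat lower, so the numbers are upper-bound illustrations only. The function name is hypothetical.

```python
def fold_difference(delta_cycles: float, efficiency: float = 1.0) -> float:
    """Fold difference in starting template implied by a cycle shift.

    efficiency = 1.0 means perfect doubling per cycle (an assumption).
    """
    return (1.0 + efficiency) ** delta_cycles


# Cycle differences reported for FIG. 21 (cancer vs. normal).
for input_ng, delta in ((10, 7), (50, 5)):
    print(f"{input_ng} ng input: {delta} cycles -> "
          f"about {fold_difference(delta):.0f}-fold")
```

With perfect doubling, the 7-cycle and 5-cycle differences would correspond to roughly 128-fold and 32-fold more p16 target in the cancer libraries.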

Example 11 Titration of the Input Amount of Genomic DNA for Preparation of Short Libraries for Analysis of Methylation by Microcon Size Fractionation from McrBC Cleaved DNA

In this example, the effect of the amount of input DNA on preparation of libraries described in Example 10 was studied. To increase the sensitivity of the assay, the libraries were first amplified following ligation of universal T7-N6 adaptor, subjected to size fractionation by Microcon filters, and re-amplified.

Aliquots of 10 ng, 1 ng and 0.1 ng of genomic DNA isolated from exemplary KG1-A leukemia cells or control genomic DNA (Coriell repository # NA16028) were digested with McrBC nuclease in 10 μl of 1× NEBuffer 2 comprising 100 μg/ml BSA, 1 mM GTP, and 0.5 units of McrBC nuclease (or 1 unit of McrBC in the case of 10 ng input DNA) at 37° C. for 35 minutes, followed by incubation at 65° C. for 10 minutes to inactivate the enzyme.

Universal T7-N6 adaptor with 6 random base 5′ overhang was ligated to the McrBC-digested DNA samples in a final volume of 30 μl containing 1× T4 DNA ligase buffer, 1 μM T7-N6 adaptor, 2,500 U of T4 DNA Ligase (New England Biolabs), and the entire 10 μl of the McrBC-digested samples. Ligation reactions were carried out for 1 hour at 16° C. and the enzyme was inactivated at 65° C. for 10 minutes.

Next, the 3′ ends of the universal adaptors were filled in by extension and the libraries were amplified by PCR. The samples were supplemented with PCR reaction buffer comprising 1× Titanium Taq buffer (BD Clontech), 1 μM T7 universal primer (SEQ ID NO: 37), 200 μM of each dNTP, 4% DMSO, 1× Titanium Taq polymerase (BD Clontech), fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) in a final volume of 75 μl. Extension of the 3′ ends to fill in the universal adaptor sequence and subsequent amplification were performed on an I-Cycler Real-Time Detection System (Bio-Rad). After initial denaturation at 95° C. for 3.5 minutes the samples were heated to 72° C. for 15 minutes and then cycled at 94° C. for 15 seconds and 65° C. for 2 minutes until a plateau was reached by the real-time amplification curves. Samples were precipitated with ethanol and reconstituted in 15 μl of TE-L buffer.

The samples were then supplemented with 80 μl of filtration buffer containing 10 mM Tris-HCl, pH 7.5, 0.1 mM EDTA, and 100 mM NaCl, and DNA was size fractionated by passing through Microcon YM-100 filters (Millipore) for 10 to 12 minutes at 500×g at ambient temperature. The filtrate fractions were concentrated by ethanol precipitation and reconstituted in 15 μl of TE-L buffer. Five nanograms of each library were then used in a re-amplification reaction with the T7 primer under the conditions described in the previous paragraph.

To quantify the short DNA fragments released by McrBC digestion from the p16 promoter CpG island, 5 ng of library material were used in a PCR reaction with a primer pair specific for a short internal promoter region. Amplification was carried out in a reaction mixture comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer specific for p16 promoter (SEQ ID NO: 61 forward and SEQ ID NO: 62 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by 94° C. for 15 sec and 68° C. for 1 minute until a plateau was reached for the cancer samples by the real time amplification curves. Aliquots of the amplified material were analyzed on a 1% agarose gel stained with ethidium bromide.

FIG. 22 illustrates the amplification of libraries derived from different input amounts of DNA. As shown, the libraries prepared from cancer and normal cells amplified with equal efficiencies. FIG. 23 shows amplification of short sequence in the CpG island of the p16 promoter in normal and cancer cells from libraries re-amplified after Microcon filtration. The insert to FIG. 23 represents gel analysis of the short p16 amplicon. As little as 1 ng of input material proved adequate for analysis of hypermethylation of the p16 promoter CpG island resulting in over 10 cycles difference between cancer and normal DNA. The present inventors were unable to detect the specific p16 sequence in libraries prepared from 0.1 ng of DNA.

This example demonstrates that the sensitivity of the assay in the present invention is significantly higher than methods known in the art for analysis of genome wide methylation of promoter CpG islands.

Example 12 Preparation of Short Libraries for Analysis of Methylation by Size-selective Amplification from McrBC Cleaved DNA

This example describes the preparation of libraries comprising short DNA sequences obtained by selective amplification of short fragments derived by McrBC cleavage and rendered amplifiable by ligation of two different universal adaptor sequences, for analysis of the methylation status of promoter CpG sites.

Aliquots of 10 ng of genomic DNA isolated from KG1-A leukemia cells or control genomic DNA (Coriell repository # NA16028) were digested with McrBC nuclease in 10 μl of 1× NEBuffer 2 containing 100 μg/ml BSA, 1 mM GTP, and 1 unit of McrBC nuclease (NEB) at 37° C. for 35 minutes, followed by incubation at 65° C. for 10 minutes to inactivate the enzyme. T7-N6 and GT-N6 adaptors with 6 random 5′-base overhangs consisting of T7GG and T7SH-6N oligos (SEQ ID NO: 32 and SEQ ID NO: 59), and Ku and GTSH-6N oligos (SEQ ID NO: 15 and SEQ ID NO: 60), respectively (Table I) were assembled as described in Example 9.

T7-N6 and GT-N6 adaptors were ligated to the McrBC-digested DNA samples in a reaction mixture containing 1× T4 DNA ligase buffer, 300 nM each adaptor, 760 U of T4 DNA Ligase (New England Biolabs), and the entire 10 μl of the McrBC digestion samples in a final volume of 30 μl. Ligation reactions were carried out for 1 hour at 16° C. and the enzyme was inactivated at 65° C. for 10 minutes.

The 3′ ends of the universal adaptors were then filled in by extension and the libraries were amplified by PCR with T7 (SEQ ID NO: 37) and Ku (SEQ ID NO: 15) primers using reduced extension time to allow only short sequences receiving adaptors at both ends to be amplified. Five nanogram aliquots of the ligation reactions were supplemented with PCR reaction buffer comprising 1× Titanium Taq buffer (BD Clontech), 250 nM each of T7 and GT primer (SEQ ID NO: 37 and SEQ ID NO: 15), 200 μM of each dNTP, 4% DMSO, 1× Titanium Taq polymerase (BD Clontech), fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) in a final volume of 75 μl. Extension of the 3′ ends to fill in the universal adaptor sequence and subsequent amplification were performed on an I-Cycler Real-Time Detection System (Bio-Rad) by incubating the reactions at 72° C. for 15 minutes. After initial denaturation at 95° C. for 2.5 minutes the samples were heated to 72° C. for 15 minutes and then cycled at 94° C. for 15 seconds, and 65° C. for 15 seconds until a plateau was reached by the real-time amplification curves.

Aliquots of 4 ng or 20 ng of amplified library material were then used to quantify, by PCR, the short DNA fragments released by McrBC digestion from the p16 promoter CpG island. Amplification was carried out in a reaction mixture containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer specific for p16 promoter (SEQ ID NO: 61 forward and SEQ ID NO: 62 reverse), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl at 95° C. for 3 minutes followed by a various number of cycles of 94° C. for 15 seconds and 68° C. for 1 minute until a plateau was reached for the cancer samples, as evidenced by the real time amplification curves.

FIG. 24 shows the amplification of short sequence in the CpG island of p16 promoter in normal and cancer cells from 20 ng or 4 ng of library DNA. As shown, between 5 and 6 cycles difference could be detected between methylated cancer DNA and unmethylated control DNA.

To establish the optimal concentration of McrBC for library preparation, the present inventors carried out titration of the enzyme in a range of 0 to 10 units in digestion reaction comprising 10 ng of genomic DNA. McrBC digestion, ligation of universal adaptors T7-N6 and GT-N6, amplification of libraries, and analysis of p16 promoter sequence were as described above. FIG. 25 depicts the result of the McrBC titration experiment. As shown, in contrast to non-methylated control DNA, increasing the amount of enzyme incubated with methylated (cancer) DNA resulted in a proportional increase in the amplification signal for the short p16 promoter sequence. Due to the increased percentage of glycerol, it was impractical to test amounts of McrBC enzyme above 10 units per reaction. The results of the previous experiment using different ratios of enzyme to template DNA, combined with the present results, indicate that the level of McrBC degradation depends mostly on the absolute amount of, or the concentration of, McrBC, and not on the ratio of enzyme to DNA template (E. Kamberov personal observation). Thus, dimerization of McrBC plays a critical role in the process of cleavage of methylated DNA.

Example 13 Utilization of the Methylation-sensitive Restriction Enzyme Not I and Whole Genome Amplification by Mechanical Fragmentation to Create a Library of Methylated Restriction Sites

This example, illustrated in FIG. 26, describes the amplification of methylated genomic DNA sites from DNA that has been digested with the methylation-sensitive restriction enzyme Not I, amplified by whole genome amplification relying on mechanical fragmentation, digested again with Not I, and amplified to select only sites that were methylated in the original intact DNA sample. A control library is also generated by omitting the first Not I digestion, which will result in all Not I sites being amplified in the final product.

Aliquots of genomic DNA (2.5 μg) were digested overnight at 37° C. with Not I restriction enzyme (25 U) in the presence of 1× buffer H (NEB). The enzyme was heat inactivated at 65° C. for 10 minutes and then cooled to 4° C. The digested DNA was precipitated with pellet paint according to the manufacturer's instructions and quantified by optical density.

Aliquots of 110 μl of genomic and Not I-digested DNA preps comprising 100 ng of DNA were heated to 65° C. for 2 minutes, vortexed for 15 seconds and incubated for an additional 2 minutes at 65° C. The samples were spun for 12 minutes at ambient temperature at 16,000×g. One hundred μl of sample was transferred to a new tube and subjected to mechanical fragmentation on a HydroShear device (Gene Machines) for 20 passes at a speed code of 3, following the manufacturer's protocol. The sheared DNA has an average size of 1.5 kb as predicted by the manufacturer and confirmed by gel electrophoresis. To prevent carry-over contamination, the shearing assembly of the HydroShear was washed 3 times each with 0.2 M HCl, and 0.2 M NaOH, and 5 times with TE-L buffer prior to and following fragmentation. All solutions were 0.2 μm filtered prior to use.

Fragmented DNA samples may be used immediately for library preparation or stored at −20° C. prior to use. The first step of this embodiment of library preparation is to repair the 3′ end of all DNA fragments and to produce blunt ends. This step comprises incubation with at least one polymerase. Specifically, 11.5 μl 10× T4 DNA ligase buffer, 0.38 μl dNTP (33 μM FC), 0.46 μl Klenow (2.3 U, USB) and 2.66 μl H₂O were added to the 100 μl of fragmented DNA. The reaction was carried out at 25° C. for 15 minutes, and the polymerase was inactivated at 75° C. for 15 minutes and then chilled to 4° C.
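The component volumes listed for the end-repair reaction can be checked for internal consistency with a few lines of arithmetic. The sketch below sums the stated volumes, confirms that the 10× ligase buffer ends up at 1×, and back-calculates the dNTP stock concentration implied by the stated 33 μM final concentration (FC). The stock concentration is an inference from the numbers, not a value given in the text, and all variable names are hypothetical.

```python
# Volumes in microliters, as listed for the end-repair reaction.
components_ul = {
    "fragmented DNA": 100.0,
    "10x T4 DNA ligase buffer": 11.5,
    "dNTP": 0.38,
    "Klenow": 0.46,
    "H2O": 2.66,
}

total_ul = sum(components_ul.values())

# Final strength of the 10x buffer after dilution into the full reaction.
buffer_final_x = 10.0 * components_ul["10x T4 DNA ligase buffer"] / total_ul

# Stock concentration implied by C_stock * V_stock = C_final * V_total.
DNTP_FINAL_UM = 33.0
dntp_stock_um = DNTP_FINAL_UM * total_ul / components_ul["dNTP"]

print(f"Total volume: {total_ul:.2f} ul")
print(f"Ligase buffer: {buffer_final_x:.2f}x")
print(f"Implied dNTP stock: {dntp_stock_um / 1000:.1f} mM")
```

The volumes add up to 115 μl, which gives exactly 1× buffer and implies a dNTP stock of about 10 mM.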

Universal adaptors were ligated to the 5′ ends of the DNA using T4 DNA ligase by addition of 4 μl T7 adaptors (10 pmol each of the blunt end, 5′ N overhang, and 3′ N overhang adaptors, SEQ ID NO: 32 and SEQ ID NO: 34, SEQ ID NO: 32 and SEQ ID NO: 35, SEQ ID NO: 33 and SEQ ID NO: 34) and 1 μl T4 DNA Ligase (2,000 U). The reaction was carried out for 1 hour at 16° C., the enzyme was inactivated at 65° C. for 10 minutes, and the samples were held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the universal adaptor and subsequent amplification of the library were carried out under the same conditions. Five ng of library is added to a 25 μl reaction comprising 25 pmol T7-C₁₀ primer (SEQ ID NO: 36), 120 nmol dNTP, 1× PCR Buffer (Clontech), 1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) are also added to allow monitoring of the reaction using the I-Cycler Real-Time Detection System (Bio-Rad). The samples are initially heated to 75° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short, blocked fragment of the universal adaptor. Subsequently, amplification is carried out by heating the samples to 95° C. for 3 minutes 30 seconds, followed by 18 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes. Following amplification, the DNA samples were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density.

Aliquots of genomic and Not I-digested amplified DNA were digested with Not I restriction enzyme by incubating 1 to 2 μg DNA in 1× Buffer H and 10 Units of Not I in a 30 μl reaction volume overnight at 37° C. The enzyme was heat inactivated at 65° C. for 10 minutes and then cooled to 4° C. The digested DNA was precipitated with pellet paint according to the manufacturer's instructions and quantified by optical density.

GT adaptors were ligated to the 5′ ends of DNA (50 ng) using T4 DNA ligase by addition of 2 μl GT adaptors (10 pmol, SEQ ID NO: 15 and SEQ ID NO: 54), 2 μl 10× DNA ligase buffer and 1 μl T4 DNA Ligase (2,000 U) in a final volume of 20 μl. The reaction was carried out for 1 h at 16° C. and then held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the GT adaptor and subsequent amplification of the library were carried out under the same conditions. Five ng of library is added to a 25 μl reaction comprising 25 pmol C₁₀ universal primer (SEQ ID NO: 38), 25 pmol Ku primer (SEQ ID NO: 15), 120 nmol dNTP, 1× PCR Buffer (Clontech), 1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) are also added to allow monitoring of the reaction using the I-Cycler Real-Time Detection System (Bio-Rad). The samples are initially heated to 75° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short, blocked fragment of the universal adaptor. Subsequently, amplification is carried out by heating the samples to 95° C. for 3 minutes 30 seconds, followed by 23 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes. Following amplification, the DNA samples were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density.

The amplified DNA was analyzed using real-time, quantitative PCR using a panel of 14 human genomic markers adjacent to known Not I restriction sites. The markers that make up the panel are listed in Table II. Quantitative Real-Time PCR was performed using an I-Cycler Real-Time Detection System (Bio-Rad), as per the manufacturer's directions. Briefly, 25 μl reactions were amplified for 40 cycles at 94° C. for 15 seconds and 68° C. for 1 minute. Standards corresponding to 10, 1, and 0.2 ng of fragmented DNA were used for each marker. A standard curve was created for each marker and used for quantification of each sample (I-Cycler software, Bio-Rad). The resulting quantities were compared between the genomic and Not I-digested samples to determine whether each site was methylated. FIG. 27 indicates that all 14 markers were detected in the genomic control sample, indicating that all sites were successfully digested and amplified. The Not I-digested DNA sample comprised 7 of the 14 sites, indicating that half of the sites in the genomic DNA were originally methylated.
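Standard-curve quantification of the kind described above is commonly done by fitting threshold cycle against the logarithm of the standard quantity and interpolating unknowns from the fitted line. The sketch below illustrates that general approach with the 10, 1, and 0.2 ng standards. It is not a description of the I-Cycler software's internal algorithm, the threshold-cycle values are synthetic (generated from an assumed slope of −3.32 and intercept of 25), and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve: Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


STANDARDS_NG = (10.0, 1.0, 0.2)  # standard quantities from the text

# Synthetic threshold cycles (assumed values, for illustration only).
synthetic_ct = [25.0 - 3.32 * math.log10(q) for q in STANDARDS_NG]

slope, intercept = fit_line([math.log10(q) for q in STANDARDS_NG], synthetic_ct)

# Interpolate a hypothetical unknown whose Ct corresponds to 2 ng.
unknown_ct = 25.0 - 3.32 * math.log10(2.0)
estimated_ng = quantity_from_ct(unknown_ct, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, unknown={estimated_ng:.2f} ng")
```

Quantities estimated this way for each marker in the genomic control and Not I-digested libraries could then be compared marker by marker.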

Example 14 Utilization of the Methylation-sensitive Restriction Enzyme Not I and Whole Genome Amplification by Chemical Fragmentation to Create a Library of Methylated Restriction Sites

This example, illustrated in FIG. 26, describes the amplification of methylated genomic DNA sites from DNA that has been digested with the methylation-sensitive restriction enzyme Not I, amplified by whole genome amplification relying on chemical fragmentation, digested again with Not I, and amplified to select only sites that were methylated in the original intact DNA sample. A control library is also generated by omitting the first Not I digestion, which will result in all Not I sites being amplified in the final product.

Aliquots of genomic DNA (2.5 μg) were digested overnight at 37° C. with Not I restriction enzyme (25 U) in the presence of 1× buffer H. The enzyme was heat inactivated at 65° C. for 10 minutes, and the samples were then cooled to 4° C. The digested DNA was precipitated with pellet paint according to the manufacturer's instructions and quantified by optical density.

Aliquots of restriction endonuclease digested and control DNA (50 ng) were diluted in TE to a final volume of 10 μl. The DNA was subsequently heated to 95° C. for 4 minutes, and then cooled to 4° C. Two μl of 10× T4 DNA Ligase buffer was added to the DNA, and the mixture was heated to 95° C. for 2 minutes and then cooled to 4° C.

In order to generate competent ends for ligation, 40 nmol dNTP (Clontech), 0.1 pmol phosphorylated random hexamer primers (Genelink), and 5 U Klenow (NEB) were added, and the resulting 15 μl reaction was incubated at 37° C. for 30 minutes and 12° C. for 1 hour. Following incubation, the reaction was heated to 65° C. for 10 minutes to destroy the polymerase activity and then cooled to 4° C.

GT adaptors were ligated to the 5′ ends of the DNA using T4 DNA ligase by addition of 4 μl adaptors (10 pmol each of the blunt end, 5′ N overhang, and 3′ N overhang adaptors, SEQ ID NO: 32 and SEQ ID NO: 34, SEQ ID NO: 32 and SEQ ID NO: 35, SEQ ID NO: 33 and SEQ ID NO: 34) and 1 μl T4 DNA Ligase (2,000 U). The reaction was carried out for 1 hour at 16° C., the enzyme was inactivated at 65° C. for 10 minutes, and the samples were held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the universal adaptor and subsequentamplification of the library were carried out under the same conditions.Five ng of library is added to a 25 μl reaction comprising 25 pmolT7-C₁₀ primer (SEQ ID NO: 36), 120 nmol dNTP, 1× PCR Buffer (Clontech),1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR GreenI (1:100,000) are also added to allow monitoring of the reaction usingthe I-Cycler Real-Time Detection System (Bio-Rad). The samples areinitially heated to 75° C. for 15 minutes to allow extension of the 3′end of the fragments to fill in the universal adaptor sequence anddisplace the short blocked fragment of the universal adaptor.Subsequently, amplification is carried out by heating the samples to 95°C. for 3 minutes 30 seconds, followed by 18 cycles of 94° C. 15 seconds,65° C. 2 minutes. Following amplification, the DNA samples were purifiedusing the QIAQUICK® kit (Qiagen) and quantified by optical density.

Aliquots of genomic and Not I-digested amplified DNA were digested with Not I restriction enzyme by incubating 1-2 μg DNA in 1× Buffer H with 10 units of Not I in a 30 μl reaction volume overnight at 37° C. The enzyme was heat inactivated at 65° C. for 10 minutes, and the samples were then cooled to 4° C. The digested DNA was precipitated with pellet paint according to the manufacturer's instructions and quantified by optical density.

GT adaptors were ligated to the 5′ ends of DNA (50 ng) using T4 DNA ligase by addition of 2 μl of GT adaptor (10 pmol, SEQ ID NO: 15 and SEQ ID NO: 54), 2 μl 10× DNA ligase buffer and 1 μl T4 DNA Ligase (2,000 U) in a final volume of 20 μl. The reaction was carried out for 1 hour at 16° C. and then held at 4° C. until use. Alternatively, the libraries can be stored at −20° C. for extended periods prior to use.

Extension of the 3′ end to fill in the GT universal adaptor and subsequent amplification of the library were carried out under the same conditions. Five ng of library is added to a 25 μl reaction comprising 25 pmol C₁₀ primer (SEQ ID NO: 38), 25 pmol Ku primer (SEQ ID NO: 15), 120 nmol dNTP, 1× PCR Buffer (Clontech), and 1× Titanium Taq. Fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000) are also added to allow monitoring of the reaction using the I-Cycler Real-Time Detection System (Bio-Rad). The samples are initially heated to 75° C. for 15 minutes to allow extension of the 3′ end of the fragments to fill in the universal adaptor sequence and displace the short, blocked fragment of the universal adaptor. Subsequently, amplification is carried out by heating the samples to 95° C. for 3 minutes 30 seconds, followed by 23 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes. Following amplification, the DNA samples were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density.

The amplified DNA was analyzed using real-time, quantitative PCR using a panel of 6 exemplary human genomic markers adjacent to known Not I restriction sites. The markers that make up the panel are listed in Table II. Quantitative Real-Time PCR was performed using an I-Cycler Real-Time Detection System (Bio-Rad), as per the manufacturer's directions. Briefly, 25 μl reactions were amplified for 40 cycles at 94° C. for 15 seconds and 68° C. for 1 minute. Standards corresponding to 10, 1, and 0.2 ng of fragmented DNA were used for each marker. A standard curve was created for each marker and used for quantification of each sample (I-Cycler software, Bio-Rad). The resulting quantities were compared between the genomic and Not I-digested samples to determine whether each site was methylated. FIG. 28 indicates that all 6 markers were detected in the genomic control sample, indicating that all sites were successfully digested and amplified. The Not I-digested DNA sample contained 3 of the 6 sites, indicating that half of the sites in the genomic DNA were originally methylated.

Example 15 Utilization of the Methylation-specific Enzyme McrBC and Sub Genome Amplification to Detect Regions of Hypomethylation

One important aspect of progression of many cancers and diseases is thehypomethylation of certain regions of DNA leading to the over-expressionof tumor promoters. It is important to be able to detect areas wheremethylation inhibition has been lost in order to understand cancer anddisease progression, and to develop diagnostic tools for theidentification of this progression as well as treatment options forthese patients, for example. FIG. 29A depicts a method for creating andamplifying libraries that are specific for hypomethylation. Test andcontrol DNA samples are digested with McrBC to generate cleavage ofhypermethylated regions. Following cleavage, random fragmentation isperformed by chemical or mechanical means and libraries are created aspreviously described in Examples 13 and 14. The resulting amplicons fromhypomethylated DNA regions are amplified and the resulting amplificationproducts can be analyzed by PCR to detect specific sequences of interestor by hybridization to large numbers of sequences, for instance on amicroarray, for discovery or diagnostic purposes, for example.

In an alternative embodiment (FIG. 29B), an additional step involving the polishing of the ends of the DNA following McrBC cleavage and ligation of adaptors comprising a Poly C sequence (10-40 bp) is performed. A universal adaptor sequence is ligated during library preparation following random fragmentation. This step allows the blockage of amplification of DNA fragments from hypermethylated regions that comprise the Poly C adaptor at both ends of the amplicons.

Example 16 Utilization of Library Generation by Mechanical Fragmentation, the Methylation-specific Enzyme McrBC, and Sub Genome Amplification to Detect Regions of Hypomethylation

A second method for the preparation of hypomethylation-specific libraries involves the use of McrBC to cleave library amplicons that are methylated (FIG. 30). In a specific embodiment, DNA is fragmented mechanically and libraries are created by polishing the ends and attaching universal adaptors. Following library preparation, methylated amplicons are digested with the methylation-specific restriction endonuclease McrBC. This digestion cleaves all library molecules that contain 2 or more methylated cytosines, so that these amplicons can no longer be amplified. Amplification of the remaining molecules will result in selection of only those amplicons that are hypomethylated. The resulting amplification products can be analyzed by PCR to detect specific sequences of interest or by hybridization to large numbers of sequences, for instance on a microarray, for discovery or diagnostic purposes.

Example 17 Utilization of Library Preparation by Chemical Fragmentation, the Methylation-specific Enzyme McrBC, and Sub Genome Amplification to Detect Regions of Hypomethylation

A third method for the generation of hypomethylation-specific librariesinvolves library preparation following chemical fragmentation anddigestion with McrBC followed by a single cycle of PCR. In a specificembodiment, DNA is fragmented chemically and libraries are created by afill-in reaction, polishing of the resulting ends, and attachinguniversal adaptors. One cycle of PCR is performed with either amethylated or non-methylated primer to create a double stranded intactmolecule. It is unclear at this time whether McrBC requires 2 methylgroups on opposite strands (trans), or if 2 methyl groups on the samestrand (cis) are capable of inducing cleavage. If methyl groups arerequired to be in trans, then a methylated oligo will be used for the 1cycle PCR reaction. However, a non-methylated oligo is used if the cisorientation is sufficient for McrBC-induced cleavage. Following librarypreparation, methylated amplicons are digested with themethylation-specific restriction endonuclease McrBC. This digestion willcleave all library molecules that contain either 1 (trans) or more than2 (cis) methylated cytosines and results in the loss of the ability toamplify these amplicons. Amplification of the remaining moleculesresults in selection of only those amplicons that are hypomethylated.The resulting amplification products can be analyzed by PCR to detectspecific sequences of interest or by hybridization to large numbers ofsequences, for instance on a microarray, for discovery or diagnosticpurposes. A figure depicting use of a methylated oligo for the singlecycle PCR reaction is illustrated in FIG. 31.

Example 18 Detection of DNA Methylation in Cancer Cells Using Methylation-sensitive Restriction Endonucleases and Whole Genome Amplification (WGA)

This example describes a method for the preparation of libraries where only methylated promoters are present in the amplified material. An outline of this procedure is depicted in FIGS. 33A, 33B, and 33C. DNA comprising a promoter CpG island is digested with a methylation-sensitive restriction endonuclease (FIG. 33A) or a mixture of several (5 or more) methylation-sensitive restriction endonucleases such as Aci I, BstU I, Hha I, HinP1 I, Hpa II, Hpy99 I, Ava I, BceA I, BsaH I, BsiE I, and Hga I (FIGS. 33B and 33C). The spatial distribution of recognition sites for these nucleases in the human genome closely mimics the distribution of the CpG dinucleotides. Their density is very high in the CpG-rich promoter regions (FIGS. 33D and 33E) and some other CpG-rich regions (CpG islands) with unknown function.

The non-methylated CpG-rich regions, such as gene promoters in normal cells, are digested into small pieces, while the methylated CpG-rich regions, such as some gene promoters in cancer cells, are maintained intact. Following digestion, the DNA is converted into a library and amplified using the random priming strand displacement method described in U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403. The promoter region is present within the amplified material only if it was methylated and protected from cleavage. The small fragments produced by digestion of a non-methylated promoter region are too small to serve as suitable template in the whole genome amplification protocol and are not amplified. Analysis of the products of amplification by PCR, microarray hybridization, probe hybridization and/or probe amplification will allow the determination of whether specific regions are methylated. Thus, the state of methylation of a specific promoter region can be determined by comparing a test sample, a negative control sample that is unmethylated, and a positive control sample that is heavily methylated.
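The selection principle in this example can be illustrated with a small in-silico digest. The sketch below is a simplified model rather than the protocol itself: site coordinates are hypothetical, a site is treated as either fully methylated or fully unmethylated, and the minimum template length `min_template` is an assumed placeholder, since no size cutoff is stated in the text.

```python
def surviving_fragments(length, sites, methylated, min_template=200):
    """Cut a linear molecule at every unmethylated site and return the
    (start, end) fragments that are long enough to act as template.

    length       -- molecule length in bp
    sites        -- positions of methylation-sensitive restriction sites
    methylated   -- set of site positions protected by methylation
    min_template -- assumed minimum amplifiable length (hypothetical value)
    """
    cuts = sorted(p for p in sites if p not in methylated)
    bounds = [0] + cuts + [length]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b - a >= min_template]

def region_amplifiable(region, fragments):
    """True if the whole region lies inside one surviving fragment."""
    start, end = region
    return any(a <= start and end <= b for a, b in fragments)

# Hypothetical 3 kb molecule with a CpG-island promoter at 1000-1200 bp
# carrying five clustered restriction sites.
sites = [1000, 1050, 1100, 1150, 1200]
promoter = (1000, 1200)

normal = surviving_fragments(3000, sites, methylated=set())
cancer = surviving_fragments(3000, sites, methylated=set(sites))

print(region_amplifiable(promoter, normal))  # unmethylated promoter is cut apart
print(region_amplifiable(promoter, cancer))  # methylated promoter stays intact
```

In this toy model the unmethylated promoter is reduced to 50 bp pieces that fall below the assumed cutoff, while the fully methylated promoter remains on a single amplifiable molecule.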

There are several potential ways for assaying the amplified material formethylation status and a couple of these are depicted in FIGS. 34 and35. A high throughput quantitative PCR method is illustrated in FIG. 34.Briefly, amplified material from control and test samples are eachplaced into 48 wells of a 96 well plate containing primer pairs for 48specific promoter regions. Quantitative real-time PCR is performed, andthe difference in the number of amplification cycles is indicative ofmethylation in the test sample. FIG. 35 illustrates how control and testsamples can be hybridized to a microarray comprising promoter regions ofinterest. The control and test samples can be compared directly using atwo color system. Control samples should have few or no spots, allowingthe methylation status of the test sample to be determined based on thestrength of the signal.

Example 19 Digestion of Genomic DNA with Methylation-sensitive Restriction Enzymes Containing CpG Dinucleotide in their Four-base Recognition Site

This example describes the analysis of the average size of DNA fragments obtained after overnight digestion of genomic DNA with methylation-sensitive restriction enzymes with recognition sites comprising the CpG dinucleotide and no adenine or thymine.

Aliquots of 300 ng of pooled genomic DNA isolated by standard proceduresfrom the peripheral blood of 20 healthy male donors were digested for 15hours with Aci I, BstUI, HinP1 I, or Hpa II restriction endonucleases(New England Biolabs). Digestion reactions were carried out in 20 μlvolumes containing 1× of the respective optimal reaction buffer for eachenzyme (New England Biolabs), 300 ng of genomic DNA, and 10 units ofrestriction enzyme, for 15 hours at 37° C., or in the case of BstU I for15 hours at 60° C. A blank control containing no restriction enzyme wasalso incubated for 15 hours at 60° C.

FIG. 36 shows 165 ng aliquots of the digestion reactions analyzed on a1% agarose gel after staining with SYBR Gold (Molecular Probes). Asshown, even after overnight digestion the majority of the gDNA is stillin the compression zone above 12 Kb. In several follow-up experiments,almost complete cleavage at CpG sites was demonstrated for all fourenzymes, as is evident by the loss of amplification by primers flankingone or more CpG sites at different promoter regions after only 1 to 2hours of cleavage (see Examples below). These results demonstrate thatcleavage by restriction enzymes with four-base recognition sites that donot contain A or T is strongly biased due to (i) depletion of the CpGdinucleotide in the human genome, and (ii) methylation of non-island CpGsites known to be located mostly in repetitive DNA sequences.
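To make the notion of site density concrete, the following sketch counts recognition sites for the four enzymes of this example in a DNA sequence. The recognition sequences are not given in the text above; they are the commonly documented ones (Aci I: CCGC, which appears as GCGG on the opposite strand; BstU I: CGCG; HinP1 I: GCGC; Hpa II: CCGG). The test sequence and function names are hypothetical, and methylation status is not modeled.

```python
# Four-base recognition sequences containing CpG and no A or T.
RECOGNITION_SITES = {
    "Aci I": ("CCGC", "GCGG"),  # non-palindromic: scan both orientations
    "BstU I": ("CGCG",),
    "HinP1 I": ("GCGC",),
    "Hpa II": ("CCGG",),
}

def site_positions(sequence, motifs):
    """Return sorted 0-based start positions of all (overlapping) motif hits."""
    sequence = sequence.upper()
    positions = set()
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            positions.add(start)
            start = sequence.find(motif, start + 1)
    return sorted(positions)

def sites_per_kb(sequence, motifs):
    """Recognition-site density, in sites per 1,000 bp of sequence."""
    return 1000.0 * len(site_positions(sequence, motifs)) / len(sequence)

# Hypothetical CpG-rich stretch, for illustration only.
toy = "TTCCGCAAGCGGTTCGCGCGAACCGGTT"
for enzyme, motifs in RECOGNITION_SITES.items():
    print(enzyme, site_positions(toy, motifs))
```

Run over a genomic region, such counts would be expected to be high within CpG islands and low elsewhere, consistent with the CpG depletion noted above.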

Example 20 Methylation Analysis of p15, p16, and E-Cadherin Promoters using Libraries Prepared by BstU I Digestion

This example demonstrates the utility of libraries prepared from DNAdigested with BstU I restriction enzyme by incorporating universalsequence using primers comprising the universal sequence at their 5′-endand a degenerate non-self-complementary sequence at their 3′-end in thepresence of DNA polymerase with strand-displacement activity for theanalysis of the methylation status of the exemplary promoter regions ofp15, p16, and E-Cadherin genes.

Genomic DNA was isolated by standard procedures from the exemplary KG1-Aleukemia cell line or from the peripheral blood of a pool of 20 healthymale donors. Digestion reactions were carried out in 50 μl volumecontaining 1× NEBuffer 2 (NEB), 50 ng DNA, and 10 units of BstU I (NEB),for 1 hour and 15 minutes at 60° C. Blank controls containing norestriction enzyme were also run in parallel. The DNA was precipitatedwith ethanol in the presence of 1.5 M ammonium acetate, washed with 75%ethanol, air dried, and resuspended in 50 μl of TE-L buffer.

Aliquots of 10 ng of each digested or non-digested DNA sample were randomly fragmented in TE-L buffer by heating at 95° C. for 4 minutes and subjected to the library synthesis protocol. The reaction mixtures contained 10 ng of fragmented DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO, 360 ng of Single Stranded DNA Binding Protein (USB), and 1 μM of K(N)₂ primer (SEQ ID NO: 14) in a final volume of 14 μl. After denaturing for 2 minutes at 95° C., the samples were cooled to 24° C., and the reaction was initiated by adding 5 units of Klenow Exo- DNA polymerase (NEB). Samples were incubated at 24° C. for 1 hour. Reactions were then stopped by heating for 5 minutes at 75° C. The samples were further amplified by quantitative real-time PCR by transferring the entire reaction mixture of the library synthesis into a PCR reaction mixture containing final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma) or 0.5 M betaine (See BRIEF DESCRIPTION OF THE DRAWINGS, FIGS. 37A, 37B, and 37C), 4% DMSO, 1:100,000 dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Amplifications were carried out for 15 cycles at 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density.

Next, the presence of amplifiable promoter sequences containing one or more CpG sites as part of the BstU I recognition site in the amplified libraries was analyzed by quantitative real-time PCR using specific primers flanking such sites. The primer pairs were as follows. For the p15 promoter: primer pair #1, p15 SF upstream (SEQ ID NO: 63) and p15 SB downstream (SEQ ID NO: 64), amplifying a 73 bp fragment with 4 BstU I restriction sites; and primer pair #2, p15 Neg F upstream (SEQ ID NO: 24) and p15 Neg B downstream (SEQ ID NO: 25), amplifying a 595 bp fragment with 5 BstU I restriction sites. For the p16 promoter: primer pair #1, p16 Nick F upstream (SEQ ID NO: 48) and p16 Nick B downstream (SEQ ID NO: 49), amplifying a 211 bp fragment with 1 BstU I restriction site; and primer pair #2, p16 LF upstream (SEQ ID NO: 65) and p16 LB downstream (SEQ ID NO: 66), amplifying a 399 bp fragment with 3 BstU I restriction sites. For the E-Cadherin promoter: primer pair #1, E-Cad Neg F upstream (SEQ ID NO: 28) and E-Cad Neg B downstream (SEQ ID NO: 29), amplifying a 223 bp fragment with 2 BstU I restriction sites; and primer pair #2, E-Cad Neg F upstream (SEQ ID NO: 28) and E-Cad LB downstream (SEQ ID NO: 67), amplifying a 336 bp fragment with 2 BstU I restriction sites. Aliquots of 20 ng of amplified library material were used in reaction mixtures containing 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine (Sigma), fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer, and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 30 μl. Reactions were run at 95° C. for 2 minutes followed by 50 cycles at 94° C. for 20 seconds and 68° C. for 1 minute.

FIGS. 37A, 37B, and 37C show the amplification of promoter sequences from the CpG islands of p15, p16, and E-Cadherin promoters in normal and cancer cells from 20 ng of library DNA. For both primer pairs, at all three promoter sites tested, a shift of between 7 and over 20 cycles was observed between libraries prepared from digested versus non-digested non-methylated control DNA. On the other hand, for all primer pairs except one (p16 promoter, primer pair #2, FIG. 37B), there was no difference between libraries made from digested and non-digested cancer DNA. Even for that one primer pair, the difference was at least an order of magnitude (more than 10 cycles) smaller than that observed with non-methylated DNA. The reason for the shift in the cancer DNA sample is not clear, but in a specific embodiment it is due to methylation pattern heterogeneity of the cancer cell line as a result of delayed methylation in an actively replicating, non-synchronous cell population. The background amplification from digested control (non-methylated) DNA in some primer sets is due to primer-dimer formation, as verified by agarose gel analysis, but in other cases corresponds to the expected amplicon and can be attributed to incomplete restriction digestion. Overall, in all three promoters the difference between methylated and non-methylated DNA is more than sufficient to clearly distinguish methylated from non-methylated sequences.

This example demonstrates that, as predicted, during the process oflibrary preparation and subsequent amplification only those DNAmolecules that are protected by methylation will amplify, whereas DNAmolecules that are non-methylated will be digested into small fragments,will not be efficiently primed, and thus will not be present in thelibrary. Therefore, the presence of a specific site in the finalamplified product will indicate that the CpG comprised in themethylation-sensitive restriction site was methylated in the originalDNA molecule. In addition to real-time PCR, analysis of the presence ofspecific methylated sites can be done by LCR, ligation-mediated PCR,probe hybridization, probe amplification, microarray hybridization, anysuitable method in the art, or a combination thereof, for example.

Example 21 Methylation Analysis of GSTP-1 Promoter using Libraries Prepared by Aci I or BstU I Digestion

This example demonstrates the utility of libraries prepared from DNA digested with Aci I restriction enzyme for the analysis of the methylation status of the exemplary promoter region of the GSTP-1 gene in a prostate cancer cell line and in clinical samples from patients having prostate adenocarcinoma.

Genomic DNA was isolated by standard procedures from the exemplary RWPEprostate cancer cell line, or from 3 clinical isolates of prostateadenocarcinoma. Digestion reactions were carried out in 50 μl volumecontaining 1× NEBuffer 3 (NEB), 50 ng DNA, and 10 units of Aci I (NEB),for 4 hours at 37° C. Blank controls containing no restriction enzymewere also run in parallel. The DNA was precipitated with ethanol in thepresence of 1.5 M ammonium acetate, washed with 75% ethanol, air dried,and resuspended in 20 μl of TE-L buffer.

Aliquots of 25 ng of each digested or non-digested DNA sample were randomly fragmented in TE-L buffer by heating at 95° C. for 4 minutes and subjected to the library preparation protocol. The reaction mixtures comprised 25 ng of fragmented DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO, 360 ng of Single Stranded DNA Binding Protein (USB), and 1 μM of K(N)₂ primer (SEQ ID NO: 14) in a final volume of 14 μl. After denaturing for 2 minutes at 95° C., the samples were cooled to 24° C., and the reaction was initiated by adding 5 units of Klenow Exo- DNA polymerase (NEB). Samples were incubated at 24° C. for 1 hour. Reactions were then stopped by heating for 5 minutes at 75° C. Aliquots representing 10 ng of genomic DNA were further amplified by quantitative real-time PCR in reaction mixtures containing final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP, 4% DMSO, 1:100,000 dilutions of fluorescein calibration dye and SYBR Green I (Molecular Probes), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Amplifications were carried out for 15 cycles at 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density.

Next, the presence of a specific but exemplary GSTP-1 promoter sequencecomprising two CpG sites as part of Aci I recognition site was analyzedin the amplified libraries by quantitative real-time PCR using specificprimers flanking the CpG sites. The primers were GSTP-1 Neg F upstream(SEQ ID NO: 30) and GSTP1 Neg B2 downstream (SEQ ID NO: 68), amplifyinga 200 bp promoter region. Aliquots of 20 ng of amplified librarymaterial were used in reaction mixtures comprising 1× Titanium Taqreaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine(Sigma), fluorescein calibration dye (1:100,000) and SYBR Green I(1:100,000), 200 nM each forward and reverse primer, and 5 units ofTitanium Taq polymerase (Clontech) in a final volume of 30 μl at 95° C.for 2 minutes followed by 50 cycles at 94° C. for 15 seconds and 68° C.for 1 minute.

FIG. 38 shows the real-time PCR methylation analysis of the studied GSTP-1 promoter region in prostate samples. Two of the clinical samples showed complete methylation of the GSTP-1 promoter site, as evidenced by the virtually identical amplification curves from libraries of Aci I-digested and undigested DNA. The third clinical sample had a shift of about 4 cycles, which in a specific embodiment the present inventors attribute to contamination with non-malignant cells. On the other hand, the RWPE prostate cell line was completely unmethylated for this promoter region, as evidenced by a shift of over 12 cycles (>4,000-fold difference) between digested and undigested DNA. A similar difference was found in a separate experiment for libraries prepared from control unmethylated DNA from the peripheral blood of healthy donors (results not shown).
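The conversion between a cycle shift and a fold difference used above follows from the doubling of product in each PCR cycle. The short sketch below shows the arithmetic; it assumes ideal amplification (efficiency of 1.0, that is, exact doubling per cycle), and the function names are hypothetical.

```python
def fold_difference(delta_cycles, efficiency=1.0):
    """Fold difference in starting template implied by a shift of
    `delta_cycles` threshold cycles. An efficiency of 1.0 (assumed here)
    means the product doubles every cycle."""
    return (1.0 + efficiency) ** delta_cycles

def remaining_fraction(delta_cycles, efficiency=1.0):
    """Fraction of amplifiable template left after digestion."""
    return 1.0 / fold_difference(delta_cycles, efficiency)

print(fold_difference(12))    # 4096.0, i.e. the ">4,000 fold" figure
print(fold_difference(4))     # 16.0 for the roughly 4-cycle shift
print(remaining_fraction(4))  # 0.0625
```

With real reactions the efficiency is typically somewhat below 1.0, so these values should be read as upper estimates.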

Example 22 Creation of a Secondary Methylome Library Enriched in Methylated Promoter Regions by Cleavage of a Primary Methylome Library with Restriction Endonucleases and Ligation of Multiple Adaptors

A method for the generation of secondary methylome libraries that areenriched in methylated promoter regions involves restrictionendonuclease cleavage of the amplification products from primarymethylome libraries followed by ligation of multiple adaptors andamplification of the resulting products. This method is illustrated inFIGS. 43A and 43B. Following amplification of a primary Methylomelibrary, all methylation-sensitive restriction endonuclease sites thatwere methylated in the original DNA are converted to unmethylated DNA inthe amplified products. These sites can be subsequently cleaved with thesame enzyme used during library creation (FIG. 43A). When a primaryMethylome library is prepared by using a mixture of several (5 or more)methylation-sensitive restriction enzymes, the secondary library can beprepared by mixing components together, ligating adaptors, andamplifying the products of several individual restriction digests of theprimary Methylome library using the same restriction endonucleases thathave been utilized in the nuclease cocktail (FIG. 43B).

Two or more adaptors comprising overhangs complementary to the resulting cleavage fragments can be ligated with high efficiency. Subsequent amplification of the ligation products results in amplification of only fragments of DNA between two methylated cleavage sites. These molecules can be analyzed by microarray hybridization, PCR analysis, probe amplification, probe hybridization, or other methods known in the art in order to determine the methylation status of the original DNA molecule (Example 18, FIG. 34 and FIG. 35), for example. Sequencing of these products can provide a tool for discovering regions of methylation not previously characterized, as no a priori knowledge of the sequences is required and the reduced complexity of the enriched secondary library allows analysis of a small number of methylated regions.

In a particular embodiment, a one-step library preparation process utilizing the dU-Hairpin Adaptor method described in Examples 33, 38, and 39 can be used for preparation of secondary Methylome libraries. In this case, two hairpin oligonucleotides with different sequences should be used to avoid the PCR suppression effect that is known to inhibit amplification of very short DNA amplicons carrying the same universal sequence at both ends.

Example 23 Analysis of Secondary Methylome Libraries by Capillary Electrophoresis

A method for the analysis of secondary methylome libraries is based on the reduced complexity of these libraries and involves the utilization of capillary electrophoresis. This method is illustrated in FIG. 44. Because the recognition sites of methylation-sensitive restriction endonucleases are mostly localized in CpG islands, the number of these sites in the genome is significantly lower than would be expected statistically. Thus, the complexity of the secondary methylome library is dependent on the number of methylated CpG islands present in the genome and the number of Hpa II restriction fragments present within these CpG islands. The number of restriction fragments in the secondary methylome library can be calculated by the formula N=n×(m−1), where n is the number of methylated CpG islands and m is the average number of restriction sites per CpG island. For example, if 1% of the 30,000 CpG islands in the genome are methylated in a particular sample (n=300), and there is an average of 5 Hpa II sites per CpG island (m=5), then there would be 300×4=1,200 restriction fragments contained in the secondary methylome library. If the amplification of the secondary methylome library is performed with the 16 combinations of 4 possible A and B oligos containing a single selecting 3′-nucleotide, then each amplification would contain 75 fragments. These 75 fragments can be resolved by capillary electrophoresis. Further simplification could be achieved by using 64 amplifications, wherein one of the oligos contains two selecting 3′-nucleotides instead of one, resulting in about 19 fragments per amplification. This analysis technique allows a genome-wide screening of CpG islands for methylation status without the development of specific tests for each CpG island contained in the genome. Sequencing of specific fragments produced within each amplification reaction will result in the identification of important regions of methylation without a priori knowledge of the importance of those regions.
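The complexity estimate in this example can be reproduced with a short calculation. The sketch below implements the formula N=n×(m−1) and the division of fragments among selective amplifications; the function names are hypothetical, and an even distribution of fragments across the selective-nucleotide combinations is assumed, as it is implicitly in the text.

```python
def secondary_library_fragments(methylated_islands, sites_per_island):
    """N = n x (m - 1): m sites inside one island bound m - 1 internal fragments."""
    return methylated_islands * (sites_per_island - 1)

def fragments_per_amplification(total_fragments, selecting_nt_a, selecting_nt_b):
    """Average fragments per reaction when oligos A and B carry the given
    numbers of selecting 3'-nucleotides (4 choices per position)."""
    combinations = 4 ** (selecting_nt_a + selecting_nt_b)
    return total_fragments / combinations

n = 30000 // 100  # 1% of 30,000 CpG islands methylated
total = secondary_library_fragments(n, 5)
print(total)                                     # 1200
print(fragments_per_amplification(total, 1, 1))  # 75.0 with 16 combinations
print(fragments_per_amplification(total, 2, 1))  # 18.75, i.e. about 19 with 64
```

Changing the assumed methylated fraction or the average number of sites per island shows how quickly the number of reactions needed to keep each capillary trace resolvable grows.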

Example 24 Creation of Methylation Specific Libraries from Serum, Plasma and Urine DNA by Cleavage with Methylation-sensitive Restriction Endonucleases

This method describes how a primary methylome library can be created from serum, plasma and urine DNA. An outline of this method is illustrated in FIG. 45. DNA isolated from serum, plasma and urine can be converted into an amplifiable library by ligation of adaptors (U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). The molecules in this library range in size from 200 bp up to 1 to 2 kb and are readily amplified by PCR. Furthermore, ligation of the adaptors does not result in any changes in the methylation pattern of the original DNA. Thus, the library molecules can be digested with a methylation-sensitive restriction endonuclease (FIG. 45A) or a mixture of several (5 or more) methylation-sensitive restriction endonucleases (FIGS. 45B and 45C). Any sites and groups of sites that are methylated, for example, hypermethylated gene promoter regions in cancer cells, will not be cleaved (FIG. 45C). Restriction site clusters that are usually non-methylated in the gene promoter regions of normal cells are cleaved at multiple sites such that the corresponding amplicons are not amplified (FIG. 45B). Amplification of the resulting library using PCR and a universal primer will result in products that either comprise a methylated restriction site or a group of sites, or lack a restriction site. The resulting molecules can be analyzed by PCR, microarray hybridization, probe hybridization, probe amplification, or other methods known in the art, for example. Of the molecules that comprise a restriction site, only those whose sites were methylated in the original starting material are detected in the amplified library.

Example 25 Creation of Secondary Methylation Specific Libraries by Cleavage with the Same Methylation-sensitive Restriction Endonucleases and Ligation of Additional Adaptors

This example describes a method of generating a secondary methylomelibrary from serum, plasma and urine DNA that comprises only thosesequences adjacent to methylated restriction sites. Because methylatedCpG islands usually have the largest concentration of such sites, theywould be a major source for the secondary Methylome library amplicons.This library will not contain any fragments present in the amplifiedproducts from the primary library in Example 24 that lack therestriction site. An outline of this example is depicted in FIG. 46.

The primary library is created and amplified as in Example 24 using PCR and a universal primer, and in a special case with the T7-C₁₀ primer (SEQ ID NO: 36). This amplification results in the loss of methylation patterns from the original DNA. The previously methylated restriction sites are now susceptible to cleavage by the restriction endonuclease. Following digestion with the same restriction endonuclease, one or more adaptors can be ligated to the resulting fragments. When the primary Methylome library is prepared by using a mixture of several (5 or more) methylation-sensitive restriction enzymes, the secondary library can be prepared by mixing together components, ligating adaptors, and amplifying the products of several individual restriction digests of the primary Methylome library using the same restriction endonucleases that have been utilized in the nuclease cocktail (FIG. 43B).

Two or more adaptors comprising overhangs complementary to the resulting cleavage fragments can be ligated with high efficiency. Subsequent amplification of the ligation products results in amplification of only fragments of DNA between two methylated cleavage sites. These molecules can be analyzed by microarray hybridization, PCR analysis, probe amplification, probe hybridization, or other methods known in the art in order to determine the methylation status of the original DNA molecule (Example 18, FIG. 34 and FIG. 35). Sequencing of these products can provide a tool for discovering regions of methylation not previously characterized, as no a priori knowledge of the sequences is required and the reduced complexity of the enriched secondary library allows analysis of a small number of methylated regions.

PCR amplification of this secondary methylome library with oligos based on these adaptors and the C₁₀ primer (SEQ ID NO: 38) will lead to amplification of only those molecules that comprised a restriction endonuclease site that was methylated in the original material. The C₁₀ primer (SEQ ID NO: 38) has previously been demonstrated to inhibit amplification of molecules that contain this sequence at both ends (U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403; and U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). The use of a single adaptor during preparation of the secondary library will result in amplification of only those sequences between the original adaptor and the first cut within the amplicon. Ligation of multiple adaptors will also allow the amplification of any fragments produced by multiple cleavage events in the same amplimer that would otherwise fail to amplify due to suppression caused by ligation of a single adaptor to both ends.

Example 26 Creation of Methylation-specific Libraries from Serum and Plasma DNA Libraries by Cleavage with the Methylation Specific Endonuclease McrBC

This example describes a method for amplifying methylated CpG sites from DNA isolated from plasma and serum and is illustrated in FIG. 47. DNA isolated from serum and plasma can be converted into an amplifiable library by ligation of poly-C containing adaptors (U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). The molecules in this library range in size from 200 bp up to 1 to 2 kb, which are readily amplified by PCR. Furthermore, ligation of the adaptors does not result in any changes in the methylation pattern of the original DNA. The resulting library molecules can be digested with the methylation-specific endonuclease McrBC. Any molecules that comprise two or more methylated CpG sites that are more than 30 bp apart will be cleaved between the two sites. A second adaptor can be ligated to the ends resulting from McrBC cleavage. The resulting products can be amplified using the second adaptor and the poly-C primer attached during ligation. Any products that do not have the second adaptor will be suppressed by the presence of poly-C sequence at each end (U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791; U.S. patent application Ser. No. 10/795,667, filed Mar. 8, 2004, now U.S. Pat. No. 7,718,403; and U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No.: 2004/0209299 and is now abandoned). The only products that will be amplified will be those comprising either a combination of the poly-C sequence and the second adaptor (2 methylated CpGs in the original library molecule), or the second adaptor at both ends (internal fragments generated from 3 or more CpGs in the original library molecule). Analysis of the resulting products allows the determination of methylation patterns of CpGs of interest. Alternatively, the amplicons can be analyzed on a microarray or by sequencing to isolate novel CpG sequences that are methylated, for example.
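The end-based selection described in this example can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function names and the end labels `"polyC"` and `"adaptor2"` are hypothetical, each McrBC cut is assumed to yield two ends that both receive the second adaptor, and suppression of molecules carrying poly-C at both ends is treated as absolute.

```python
def products_after_mcrbc(n_cuts):
    """End labels of the pieces produced by n_cuts McrBC cuts in one library molecule.

    The original molecule carries a poly-C adaptor at each end; every new end
    created by a cut is assumed to receive the second adaptor.
    """
    ends = ["polyC"] + ["adaptor2"] * (2 * n_cuts) + ["polyC"]
    return [(ends[i], ends[i + 1]) for i in range(0, len(ends), 2)]

def is_amplified(product):
    """Products with poly-C at both ends are suppressed; all others amplify."""
    left, right = product
    return not (left == "polyC" and right == "polyC")

for cuts in (0, 1, 2):
    pieces = products_after_mcrbc(cuts)
    print(cuts, [(p, is_amplified(p)) for p in pieces])
```

With no cut, the single poly-C/poly-C molecule is suppressed; one cut yields two poly-C/second-adaptor products; two cuts additionally release an internal fragment with the second adaptor at both ends.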

Example 27 Preparation and Amplification of Whole Genome Libraries from Bisulfite-converted DNA using ‘Resistant’ Adaptors and a Ligation Reaction

This example describes a method for the creation of a whole genome library prior to bisulfite conversion. Amplification of the converted library is performed following bisulfite conversion using universal priming sequences attached during library preparation. This method is outlined in FIG. 49. Genomic DNA is randomly fragmented, and adaptors that are resistant to bisulfite modification are attached to the ends of the DNA fragments. There are two types of bisulfite-resistant adaptors that can be utilized during ligation, and these are illustrated in FIG. 50. The first type of adaptor comprises an oligo that is ligated to the fragmented DNA (oligo 1) that has no cytosines present, but only guanine, adenine, and thymine. Following ligation, an extension reaction is performed using dTTP, dATP, and dmCTP, resulting in incorporation of bisulfite-resistant methylated cytosines complementary to the guanines in oligo 1. Thus, the attached adaptor sequence is resistant to bisulfite modification due to the absence of unmethylated cytosines. The second type of adaptor comprises methylated cytosines in oligo 1, along with adenine and thymine, but no guanine. Fill-in of the 3′ ends of the ligated adaptor results in incorporation of thymine, guanine and adenine, but no cytosine. Thus, these ligated adaptors are also resistant to bisulfite conversion as they do not contain any unmethylated cytosines. Bisulfite conversion is carried out on the resulting libraries. The library molecules are subsequently amplified using the universal primer. The products of amplification can be analyzed by any traditional means of methylation-specific analysis, including MS-PCR and sequencing.
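Why the first adaptor type survives bisulfite treatment can be shown with a toy Python model. The oligo sequence below is hypothetical (it is not one of the sequences of the invention), the letter `m` is an ad hoc symbol for 5-methylcytosine, and conversion is assumed to be complete.

```python
def bisulfite_convert(seq):
    """Read-out of a strand after bisulfite conversion and PCR.

    'C' = unmethylated cytosine, read as T after conversion;
    'm' = 5-methylcytosine, protected and read as C.
    """
    return seq.replace("C", "T").replace("m", "C")

def fill_in_with_dmCTP(template):
    """Complementary strand synthesized with dATP, dTTP and dmCTP only.

    The template is assumed to contain only G, A and T (adaptor type 1).
    """
    pair = {"A": "T", "T": "A", "G": "m"}
    return "".join(pair[base] for base in reversed(template))

oligo1 = "GATGGAGTAGTG"               # hypothetical oligo with no cytosines
fill_in = fill_in_with_dmCTP(oligo1)  # contains only m, A and T

print(bisulfite_convert(oligo1))      # unchanged: no C to convert
print(bisulfite_convert(fill_in))     # still the exact complement of oligo1
print(bisulfite_convert("ACGTCCmG"))  # genomic DNA: only the methylated C survives
```

Both adaptor strands read the same before and after conversion, so the universal priming site is preserved, whereas unmethylated cytosines in the genomic insert are converted.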

Example 28 Optimization of the Cleavage of Genomic DNA by the Methylation-sensitive Restriction Enzyme AciI

This example illustrates the increased restriction enzyme cleavage efficiency observed after pre-heating genomic DNA, specifically as it pertains to cleavage by the restriction enzyme Aci-I within GC-rich promoter regions. GC-rich DNA sequences, through interactions with proteins, may form alternative (non-Watson-Crick) DNA conformation(s) that are stable even after protein removal and DNA purification. These putative DNA structures could be resistant to restriction endonuclease cleavage and affect the performance of the methylation assay. Heating DNA to sub-melting temperatures reduces the energetic barrier and accelerates the transition of DNA from a non-canonical form to a classical Watson-Crick structure.

Aliquots of 200 ng of purified genomic DNA purchased from the Coriell Institute for Medical Research (repository # NA14657) were pre-heated for 30 minutes at 85° C., 90° C., or 95° C. in 50 μl of 1× NEBuffer 3 (50 mM Tris-HCl, 10 mM MgCl₂, 100 mM NaCl, 1 mM Dithiothreitol, pH 7.9 at 25° C.). Samples were cooled to 37° C. and digested with 10 units of Aci-I (NEB) for 18 hours at 37° C. Control non-digested DNA and DNA that had not been pre-heated were also run in parallel.

The effect of pre-heating genomic DNA on cleavage efficiency was evaluated using a PCR assay with primers flanking three Aci-I enzyme recognition sites within the CpG rich promoter region of the human p16 gene. Aliquots of 20 ng of each DNA sample were analyzed by quantitative real-time PCR in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, Fluorescein (1:100,000) and SYBR Green I (1:100,000), 200 nM each p16 forward and reverse primer (SEQ ID NO: 48 and SEQ ID NO: 49), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl. Reactions were initiated at 95° C. for 3 min followed by 40 cycles at 94° C. for 15 sec and 68° C. for 1 min.

As shown in FIG. 52, pre-heating genomic DNA at 95° C. prior to Aci-I digest resulted in reduced cleavage, shown as a left-shifted amplification profile indicative of a greater starting concentration of template. Heating at 90° C. had almost no effect on cleavage, whereas pre-heating at 85° C. improved the cleavage by about a factor of 2 compared to control that was not pre-heated. This improvement of cleavage by pre-heating at 85° C. was confirmed for multiple sites and multiple restriction enzymes (results are not shown) and is routinely used in our protocols for optimal digestion of genomic DNA with methylation-sensitive restriction enzymes.

The improved digestion following heat pre-treatment of genomic DNA suggests that a substantial fraction of DNA after purification may contain non-canonical nuclease-resistant structures. Upon heating, these structures may be converted into standard restriction enzyme cleavable form. Heating should not exceed the melting temperature that could cause DNA denaturation and a complete or partial loss of DNA cleavability by the restriction enzymes. In the experiment presented in FIG. 52, the reduced DNA cleavage after pre-heating at 90° C. and 95° C. is most likely a consequence of thermally-induced partial DNA denaturation.

Example 29 Methylation Analysis of 24 Promoter Regions in Random-primed Libraries Prepared from KG1-A Leukemia Cell Line DNA after Simultaneous Cleavage with 5 Methylation-sensitive Restriction Enzymes (Methylome Libraries)

This example demonstrates the utility of the Methylome libraries prepared from DNA digested with a mixture of 5 methylation-sensitive restriction enzymes. The libraries were prepared by incorporating universal sequence using primers comprising the universal sequence at their 5′-end and a degenerate non-self-complementary sequence at their 3′-end in the presence of DNA polymerase with strand-displacement activity. The Methylome libraries were amplified by PCR and used for analysis of the methylation status of promoter regions for 24 genes implicated in cancer.

The invention employs the use of several (5) methylation-sensitive restriction enzymes to convert intact non-methylated CpG-rich DNA regions into restriction fragments that fall below the minimum length competent for amplification by random-primed whole genome amplification (WGA) (U.S. patent application Ser. No. 10/293,048, filed Nov. 13, 2002, now U.S. Pat. No. 7,655,791), while methylated CpG-rich regions resistant to digestion are efficiently amplified. The invention relies on the simultaneous use of all 5 or more restriction enzymes in one optimized reaction buffer described below. Although many restriction enzymes are predicted to follow a one-dimensional diffusion mechanism after binding DNA, the buffer conditions and methylation sensitive enzyme mix specified in the invention show no detectable interference between different restriction endonucleases.

The importance of implementation of multiple methylation-sensitive restriction enzymes in methylome library preparation stems from the analysis of promoter regions in the human genome. The spatial distribution of methylation-sensitive restriction sites for restriction endonucleases with 4 and 5 base recognition sites such as, for example, Aci I, Bst UI, Hha I, HinP1 I, Hpa II, Hpy 99I, Hpy CH4 IV, Ava I, Bce AI, Bsa HI, Bsi EI, and Hga I closely mimics the distribution of the CpG dinucleotides in these regions. When DNA is incubated with a single methylation sensitive enzyme, the resulting digestion is incomplete with many restriction sites remaining uncut. Factors contributing to this phenomenon are likely the extremely high GC-content and potential for alternative secondary structure. As a result, DNA pre-treated with one restriction enzyme may still contain substantial amounts of uncut non-methylated sites. Co-digestion of DNA with a cocktail of 5 or more methylation-sensitive restriction enzymes results in efficient conversion of all non-methylated CpG islands into very small DNA fragments while leaving completely methylated CpG regions intact. Subsequently, whole genome amplification (WGA) of DNA pre-treated with the restriction enzyme cocktail results in amplification of all DNA regions except the CpG- and restriction site-rich regions that were not methylated in the original DNA. These regions are digested into fragments that fail to amplify using the random-primed WGA method. Multiple-enzyme-mediated depletion of non-methylated promoter regions in the amplified methylome library is so efficient that non-methylated CpG-rich regions cannot be detected by PCR. Those regions encompassing densely methylated CpG islands are not affected by the enzyme cocktail treatment, are efficiently amplified by the WGA process, and can later be easily detected and quantified by real-time PCR.
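The benefit of combining enzymes can be illustrated with a minimal in-silico digest in Python. The recognition sequences are the published ones for the five enzymes of the cocktail used in this example; everything else is an assumption made for illustration: the toy sequence is hypothetical, all sites are treated as non-methylated and fully cut, and the start of each site is used as the cut position.

```python
import re

# Recognition sequences (5'-3') of the five enzymes in the cocktail.
SITES = {
    "AciI": ["CCGC", "GCGG"],  # asymmetric site, so both orientations are listed
    "BstUI": ["CGCG"],
    "HhaI": ["GCGC"],
    "HinP1I": ["GCGC"],
    "HpaII": ["CCGG"],
}

def site_positions(seq, enzymes):
    """Sorted start positions of every recognition site for the chosen enzymes."""
    hits = set()
    for enzyme in enzymes:
        for motif in SITES[enzyme]:
            # A lookahead pattern also reports overlapping sites.
            hits.update(m.start() for m in re.finditer(f"(?={motif})", seq))
    return sorted(hits)

def longest_uncut_stretch(seq, enzymes):
    """Longest distance between neighbouring cut positions (or sequence ends)."""
    bounds = [0] + site_positions(seq, enzymes) + [len(seq)]
    return max(b - a for a, b in zip(bounds, bounds[1:]))

toy_promoter = "ATCCGGTTGCGCAACGCGTTCCGCAT"  # hypothetical CpG-rich sequence
print(longest_uncut_stretch(toy_promoter, ["HpaII"]))
print(longest_uncut_stretch(toy_promoter, list(SITES)))
```

A single enzyme leaves most of the toy sequence in one piece, while the full cocktail cuts it into much shorter fragments, which is the property exploited to push non-methylated CpG-rich regions below the length that can be amplified.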

To synthesize whole methylome libraries, genomic DNA isolated by standard procedures from the exemplary KG1-A leukemia cell line or control genomic DNA (Coriell repository # NA16028) was preheated at 80° C. for 20 min (see Example 28) in 50 μl reactions containing 1× NEBuffer 4 (NEB) and 500 ng DNA. Samples were cooled to 37° C. for 2 min and 6.6 units each of AciI and HhaI, and 3.3 units each of BstUI, HpaII, and HinP1I (NEB) were added. Sample digestions were incubated for 18 hours at 37° C., followed by 2 hours at 60° C. Blank controls containing no restriction enzymes were also run in parallel. The DNA was precipitated with ethanol in the presence of 1.5 M ammonium acetate and 50 μg/ml of glycogen, washed with 75% ethanol, air dried, and resuspended in 50 μl of TE-L buffer.

Aliquots of 30 ng of each digested or non-digested DNA sample were randomly fragmented in TE-L buffer by heating at 95° C. for 4 minutes and subjected to library synthesis. The reaction mixtures comprised 30 ng of fragmented DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO, 360 ng of Single Stranded DNA Binding Protein (USB), and 1 μM of K(N)2 primer (SEQ ID NO: 14) in a final volume of 14 μl. After denaturing for 2 minutes at 95° C., the samples were cooled to 24° C., and the reactions were initiated by adding 2.5 units of Klenow Exo- DNA polymerase (NEB). Samples were incubated at 24° C. for 1 hour and terminated by heating for 5 minutes at 75° C. Aliquots of 10 ng of each sample were amplified by quantitative real-time PCR in a reaction mixture comprising the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal K_(U) primer (SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl. Reactions were carried out at 95° C. for 1 min, followed by 14 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using the QIAQUICK® kit (Qiagen) and quantified by optical density reading.

The presence of methylated DNA within 24 exemplary cancer gene promoters was analyzed by quantitative real-time PCR using amplified libraries and a panel of 40 specific primer pairs. Primers were designed to test the libraries for amplicons spanning CpG-rich regions within promoters. The presence or absence of amplification for specific sequences that display a high frequency of potential cleavage sites was indicative of the methylation status of the promoter. Initially, a set of 24 exemplary promoters frequently implicated in different types of cancer was evaluated. The primer pairs used in the PCR assays are listed in Table IV.

TABLE IV
METHYLATION PROFILE OF EXEMPLARY KG1-A LEUKEMIA CELL LINE

Promoter | Sequence (5′-3′)* | Position** | Methylation***
P16 (CDKN2A) | F GGTAGGGGGACACTTTCTAGTC (SEQ ID NO: 48); R AGGCGTGTTTGAGTGCGTTC (SEQ ID NO: 49) | Upstream | +
P16 (CDKN2A) | F GGTGCCACATTCGCTAAGTGC (SEQ ID NO: 65); R GCTGCAGACCCTCTACCCAC (SEQ ID NO: 66) | Downstream | -
P15 (CDKN2B) | F CCTCTGCTCCGCCTACTGG (SEQ ID NO: 97); R CACCGTTGGCCGTAAACTTAAC (SEQ ID NO: 98) | Flanking | +
E-Cadherin | F GCTAGAGGGTCACCGCGT (SEQ ID NO: 28); R CTGAACTGACTTCCGCAAGCTC (SEQ ID NO: 29) | Upstream | +
E-Cadherin | F GCTAGAGGGTCACCGCGT (SEQ ID NO: 28); R CAGCAGCAGCGCCGAGAGG (SEQ ID NO: 67) | Flanking | +
GSTP-1 | F GTGAAGCGGGTGTGCAAGCTC (SEQ ID NO: 30); R GAAGACTGCGGCGGCGAAAC (SEQ ID NO: 31) | Upstream | -
MGMT | F GCACGCCCGCGGACTA (SEQ ID NO: 99); R CCTGAGGCAGTCTGCGCATC (SEQ ID NO: 100) | Upstream | +
MGMT | F GCCCGCGCCCCTAGAACG (SEQ ID NO: 101); R CACACCCGACGGCGAAGTGAG (SEQ ID NO: 102) | Downstream | +/-
RASSF-1 | F GCCCAAAGCCAGCGAAGCAC (SEQ ID NO: 103); R CGCCACAGAGGTCGCACCA (SEQ ID NO: 104) | Flanking | -
hMLH-1 | F TCCGCCACATACCGCTCGTAG (SEQ ID NO: 105); R CTTGTGGCCTCCCGCAGAA (SEQ ID NO: 106) | Upstream | -
BRCA-1 | F CCCTTGGTTTCCGTGGCAAC (SEQ ID NO: 107); R CTCCCCAGGGTTCACAACGC (SEQ ID NO: 108) | Flanking | -
VHL | F CTAGCCTCGCCTCCGTTACAAC (SEQ ID NO: 109); R GCTCGGTAGAGGATGGAACGC (SEQ ID NO: 110) | Upstream | -
APC-A1 | F GGTACGGGGCTAGGGCTAGG (SEQ ID NO: 111); R GCGGGCTGCACCAATACAG (SEQ ID NO: 112) | Flanking | -
APC-A1 | F CGGGTCGGGAAGCGGAGAG (SEQ ID NO: 113); R TGGCGGGCTGCACCAATACAG (SEQ ID NO: 114) | Downstream | -
DAPK-1 | F GTGAGGAGGACAGCCGGACC (SEQ ID NO: 115); R GGCGGGAACACAGCTAGGGA (SEQ ID NO: 116) | Downstream | +
TIMP-3 | F AGGGGCACGAGGGCTCCGCT (SEQ ID NO: 117); R GGGCAAGGGGTAACGGGGC (SEQ ID NO: 118) | Flanking | +
TIMP-3 | F CAGCTCCTGCTCCTTCGCC (SEQ ID NO: 119); R GCTGCCCTCCGAGTGCCC (SEQ ID NO: 120) | Downstream | +
ESR-1 | F CTGGATCCGTCTTTCGCGTTTA (SEQ ID NO: 121); R TTGTCGTCGCTGCTGGATAGAG (SEQ ID NO: 122) | Upstream | +
ESR-1 | F GGCGGAGGGCGTTCGTC (SEQ ID NO: 123); R AGCACAGCCCGAGGTTAGAGG (SEQ ID NO: 124) | Downstream | +
MYOD-1 | F CCTGATTTCTACAGCCGCTCTAC (SEQ ID NO: 125); R TCCAAACCTCTCCAACACCCGACT (SEQ ID NO: 126) | Upstream | +
MYOD-1 | F CCTGGCCGAGAAGCTAGGG (SEQ ID NO: 127); R CGGCCTGATTTGTGGTTAAGGA (SEQ ID NO: 128) | Flanking | +
CALCA | F AGTTGGAAGAGTCCCTACAATCCTG (SEQ ID NO: 129); R CGTCCCACTTGTATTTGCATTGAG (SEQ ID NO: 130) | Upstream | +
CALCA | F CTGGCGCTGGGAGGCATCAG (SEQ ID NO: 131); R GCGGGAGGTGGCTTGGATCA (SEQ ID NO: 132) | Flanking | +
CHFR | F CGTGATCCGCAGGCGACGAA (SEQ ID NO: 133); R TCACCAAGAGCGGCAGCTAAAG (SEQ ID NO: 134) | Upstream | -
CHFR | F GAAGTCGCCTGGTCAGGATCAAA (SEQ ID NO: 135); R GCCGCTGTCAAGAGACATTGC (SEQ ID NO: 136) | Flanking | -
PTGS-2 | F CGGTATCCCATCCAAGGCGA (SEQ ID NO: 137); R CTCTCCTCCCCGAGTTCCAC (SEQ ID NO: 138) | Upstream | -
MDR-1 | F GTGGAGATGCTGGAGACCCCG (SEQ ID NO: 139); R CTCTAGTCCCCCGTCGAAGCC (SEQ ID NO: 140) | Downstream | -
EDNRB | F CGGGAGGAGTCTTTCGAGTTCAA (SEQ ID NO: 141); R CGGGAGGAATACAGACACGTCTT (SEQ ID NO: 142) | Upstream | +
EDNRB | F GGGCATCAGGAAGGAGTTTCGAC (SEQ ID NO: 143); R TCGCCAGTATCCACGCTCAA (SEQ ID NO: 144) | Downstream | +
RARβ-2 | F AAAGAAAACGCCGGCTTGTG (SEQ ID NO: 145); R CTACCCGGGCTGCTAACCTTCA (SEQ ID NO: 146) | Upstream | +
RARβ-2 | F GGACTGGGATGCCGAGAAC (SEQ ID NO: 147); R TTTACCATTTTCCAGGCTTGCTC (SEQ ID NO: 148) | Flanking | +
RUNX-3 | F GGGGCTCCGCCGATTG (SEQ ID NO: 149); R CGCAGCCCCAGAACAAATCCT (SEQ ID NO: 150) | Upstream | -
RUNX-3 | F GGCCCCGCCACTTGATTCT (SEQ ID NO: 151); R CGGCCGCCCCTCGTG (SEQ ID NO: 152) | Flanking | -
RUNX-3 | F CCGGGACAGCCACGAGGG (SEQ ID NO: 153); R GCGAGAAGCGGGAAAGCAGAAGC (SEQ ID NO: 154) | Downstream | -
TIG-1 | F CCAACTTTCCTGCGTCCATGC (SEQ ID NO: 155); R AGGCTGCCCAGGGTCGTC (SEQ ID NO: 156) | Flanking | +
TIG-1 | F CTCGCGCTGCTGCTGTTGCTC (SEQ ID NO: 157); R TGAGGCTGCCCAGGGTCGTCGG (SEQ ID NO: 158) | Downstream | +
CAV-1 | F GGGACGCCTCTCGGTGGTT (SEQ ID NO: 159); R GGCCCGGACGTGTGCT (SEQ ID NO: 160) | Upstream | -
CAV-1 | F CCTGCTGGGGGTTCGAAGA (SEQ ID NO: 161); R CCCCTGCCAGACGCCAAGAT (SEQ ID NO: 162) | Downstream | +
CD44 | F TCGGTCATCCTCTGTCCTGACGC (SEQ ID NO: 163); R GGGGAACCTGGAGTGTCGC (SEQ ID NO: 164) | Upstream | -
CD44 | F CCTCTGCCAGGTTCGGTCC (SEQ ID NO: 165); R GCTGCGTGCCACCAAAACTTGTC (SEQ ID NO: 166) | Downstream | -

*F = Forward Primer, R = Reverse Primer
**Position of amplicon relative to the gene transcription start
***Methylation status of promoter sites as determined by the relative positions of amplification curves of libraries from digested cancer DNA (C-C, Cancer Cut) and normal DNA (N-C, Normal Cut) as illustrated in FIG. 53. "+" designates a complete curve shift (complete methylation), "+/-" designates a partial shift (partial methylation), and "-" designates no shift (no methylation)

50 ng DNA aliquots from the amplified libraries were analyzed by quantitative real-time PCR in reaction mixtures comprising the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table IV), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl. Reactions were run at 95° C. for 3 min followed by 45 cycles at 94° C. for 15 sec and 68° C. for 1 min.
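For bench use, the final concentrations listed above can be converted to pipetting volumes with the usual C1·V1 = C2·V2 relation. The short Python sketch below does this for a 50 μl reaction; the stock concentrations are assumptions chosen for illustration (they are not stated in this example), and the function name is hypothetical.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final.

    final_conc and stock_conc must be expressed in the same unit.
    """
    return final_conc * final_volume_ul / stock_conc

FINAL_VOLUME_UL = 50

# name: (final concentration from the text, ASSUMED stock concentration)
components = {
    "Titanium Taq buffer (x)": (1, 10),   # assumed 10x stock
    "each dNTP (uM)": (200, 10_000),      # assumed 10 mM each in the stock mix
    "DMSO (%)": (4, 100),                 # assumed neat DMSO
    "betaine (M)": (0.5, 5),              # assumed 5 M stock
    "each primer (nM)": (200, 10_000),    # assumed 10 uM primer stocks
}

volumes = {
    name: stock_volume_ul(final, stock, FINAL_VOLUME_UL)
    for name, (final, stock) in components.items()
}
for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul")
```

The remaining volume is made up of template, dyes, polymerase and water; with different stock concentrations only the second number of each pair needs to change.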

FIG. 53 shows typical amplification curves of completely methylated, partially methylated, and non-methylated promoter sites in KG1-A cell line as exemplified by the promoters for the human TIG-1, MGMT, and BRCA-1 genes respectively.

Example 30 Preparation and Labeling of Secondary Methylome Libraries Enriched in Methylated CpG-islands for Microarray Hybridization

This example demonstrates preparation of what may be termed a “Secondary Methylome” library derived from the amplified primary Methylome library. Secondary libraries are derived by cleavage of the primary library with the same set of methylation-sensitive restriction endonucleases used in preparation of the primary library and subsequent amplification of the excised short DNA fragments. Restriction sites originally methylated in the DNA sample were refractory to cleavage in the primary library; however, after amplification, substituting the 5′-methyl cytosines of the starting template DNA with non-methylated cytosines of the primary library DNA conveys cleavage sensitivity to these previously protected restriction sites. Incubation of the amplified primary library with the exemplary restriction endonuclease set (Aci I, Hha I, HinP1 I, or Hpa II) would have no effect for amplicons lacking those restriction sites, produce a single break for amplicons with one site, and release one or more restriction fragments from CpG-rich amplicons with two or more corresponding restriction sites. Selective ligation of adaptors (comprising 5′-CG-overhangs complementary to the ends of Aci I, Hha I, HinP1 I, and Hpa II restriction fragments, or blunt-end adaptors compatible with the ends of fragments produced by Bst UI) and subsequent amplification of the ligation products by PCR results in amplification of only those DNA fragments that were originally flanked by two methylated restriction sites. Secondary Methylome libraries generated by different restriction enzymes can be mixed together to produce a redundant secondary Methylome library containing overlapping DNA restriction fragments originating from the methylated CpG islands present in the sample. These libraries are highly enriched for methylated sequences and can be analyzed by hybridization to a promoter microarray or by real-time PCR using very short PCR amplicons.
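The three outcomes described above (no site, one site, two or more sites) can be captured in a short Python sketch. It is an idealized model under stated assumptions: the function name and the example cut positions are hypothetical, cleavage is assumed to be complete, and only fragments with a restriction end on both sides are taken to receive adaptors at both ends.

```python
def internal_fragments(cut_positions):
    """Fragments of one primary-library amplicon that are flanked by two cuts.

    cut_positions: positions (in bp) of restriction sites that were methylated
    in the original DNA and become cleavable after amplification.
    """
    cuts = sorted(cut_positions)
    return list(zip(cuts, cuts[1:]))

print(internal_fragments([]))               # no site: amplicon is unaffected
print(internal_fragments([100]))            # one site: single break, nothing released
print(internal_fragments([100, 180, 400]))  # three sites: two internal fragments
```

An amplicon with n such sites releases n - 1 internal fragments, so densely methylated CpG islands contribute the most to the secondary library.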

An example of the process and resulting data are presented here in detail. Primary Methylome libraries were prepared from genomic DNA isolated by standard procedure from the LNCaP prostate cancer cell line (Coriell Institute for Medical Research) or from normal “non-methylated” DNA isolated from the peripheral blood of a healthy male donor. Sixty nanogram aliquots of cancer or normal DNA were pre-heated at 80° C. for 20 min in 25 μl reactions comprising 1× NEBuffer 4 (NEB). Samples were cooled to 37° C. for 2 min and 3.3 units each of AciI and HhaI plus 1.67 units each of BstUI, HpaII, and HinP1I (NEB) were added. Samples were then incubated for 14 hours at 37° C., followed by 2 hours at 60° C. The DNA was precipitated with ethanol in the presence of 0.3 M sodium acetate and 2 μl of PelletPaint (Novagen), washed with 75% ethanol, air dried, and resuspended in 20 μl of TE-L buffer. Aliquots of 30 ng of each digested DNA sample were randomly fragmented in TE-L buffer by heating at 95° C. for 4 minutes and subjected to library synthesis. The reaction mixtures comprised 30 ng of fragmented DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP, 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO, 360 ng of Single Stranded DNA Binding Protein (USB), and 1 μM of K(N)2 primer (SEQ ID NO: 14) in a final volume of 14 μl. After denaturing for 2 minutes at 95° C., the samples were cooled to 24° C., and the synthesis reactions were initiated by adding 2.5 units of Klenow Exo- DNA polymerase (NEB). Samples were incubated at 24° C. for 1 hour and reactions were terminated by heating for 5 minutes at 75° C. Ten nanogram aliquots of the libraries were then amplified by PCR with the universal primer (K_(U)) and product accumulation was monitored in real-time with SYBR Green I. The amplification reaction mixture comprised the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal K_(U) primer (SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Reactions were carried out at 95° C. for 1 min, followed by 12 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries from cancer or normal DNA were pooled and purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density.

For preparation of secondary methylome libraries, 1.8 μg aliquots of cancer and 1.8 μg aliquots of normal primary methylome library DNA were digested in three separate tubes each in a final volume of 90 μl with 22.5 units of AciI in NEBuffer 3, 15 units of HpaII in NEBuffer 4, or 30 units of HhaI plus 15 units of HinP1I in NEBuffer 4. Following pre-heating at 80° C. for 20 min, the samples were cooled to 37° C. for 2 min and the restriction enzymes were added at the amounts specified above. Samples were incubated for 16 hours at 37° C. and the enzymes were inactivated for 10 min at 65° C. To size fractionate, the products of the three digestion reactions of cancer DNA and the products of the three digestion reactions of normal DNA were combined, diluted to 1.32 ml with dilution buffer (10 mM Tris-HCl, pH 8.0, 0.1 mM EDTA, and 150 mM NaCl), and aliquots of 440 μl were loaded on Microcon YM-100 filters (Millipore) that had been pre-washed with the above dilution buffer. Filters were centrifuged at 500×g for 20 minutes and the flow-through fractions of cancer or normal samples were combined, precipitated with ethanol in the presence of 0.3 M sodium acetate and 2 μl of PelletPaint (Novagen), washed with 75% ethanol, air dried, and resuspended in 36 μl of TE-L buffer. To convert the filtered fragments to an amplifiable secondary library, Y1 and Y2 universal adaptors (Table V) comprising unique sequences comprising only C and T (non-Watson-Crick pairing bases) on one strand and having a CG 5′ overhang on the opposite (A and G) strand were annealed and ligated to the overhangs of the restriction fragments produced as described above. Digested and filtered library DNA from the previous step was incubated with Y1 and Y2 adaptors (Table V) each present at 0.6 μM and 1,200 units of T4 DNA ligase in 45 μl of 1× T4 DNA ligase buffer (NEB) for 50 min at 16° C. followed by 10 min at 25° C. Libraries were then split into 3 aliquots of 15 μl each and amplified by PCR and monitored in real time using a reaction mixture containing final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 0.25 μM each of universal primers (Table V, SEQ ID NO: 168 and SEQ ID NO: 170), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After an initial incubation at 75° C. for 10 min to fill in the recessed 3′ ends of the ligated restriction fragments, amplifications were carried out at 95° C. for 3 min, followed by 13 cycles of 94° C. for 15 sec and 65° C. for 1.5 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries from cancer or normal DNA were pooled and used as template in PCR labeling for subsequent microarray hybridizations.
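The volumes and amounts in the size-fractionation and ligation steps can be cross-checked with simple arithmetic. The Python lines below are only a sanity check of the figures given in the text; the variable names are hypothetical.

```python
# Size fractionation: three 90 ul digests are pooled and diluted to 1.32 ml.
pooled_ul = 3 * 90
diluted_ul = 1320
aliquot_ul = 440

n_filters = diluted_ul // aliquot_ul       # Microcon filters needed per sample
dilution_factor = diluted_ul / pooled_ul   # fold dilution of the pooled digests

# Ligation: each adaptor at 0.6 uM in 45 ul (uM x ul = pmol).
adaptor_pmol = 0.6 * 45

# Ligation product is split into 15 ul aliquots for amplification.
n_pcr_aliquots = 45 // 15

print(n_filters, round(dilution_factor, 2), round(adaptor_pmol, 1), n_pcr_aliquots)
```

The pooled digests are thus diluted roughly five-fold before filtration, one filter is used per 440 μl aliquot, and each ligation contains about 27 pmol of each adaptor.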

TABLE V
OLIGONUCLEOTIDES AND ADAPTORS USED FOR SECONDARY METHYLOME LIBRARIES PREPARATION AND ANALYSIS

Code Name | Sequence* (5′-3′ unless otherwise indicated)
Y1 Adaptor | 5′-CGAGAGAAGGGAx** (SEQ ID NO: 167); TCTCTTCCCTCTCTTTCC-5′ (SEQ ID NO: 168)
Y2 Adaptor | 5′-CGAAGAGAGAGGGx (SEQ ID NO: 169); TTCTCTCTCCCTTCCTTC-5′ (SEQ ID NO: 170)
GSTP-1 (SH) | F AGTTCGCTGCGCACACTT (SEQ ID NO: 190); R CGGGGCCTAGGGAGTAAACA (SEQ ID NO: 191)
RASSF-1 (SH) | F CCCAAAGCCAGCGAAGCACG (SEQ ID NO: 192); R TCAGGCTCCCCCGACAT (SEQ ID NO: 193)
CD44 (SH) | F CTGGGGGACTGGAGTCAAGTG (SEQ ID NO: 194); R CCAACGGTTTAGCGCAAATC (SEQ ID NO: 195)
P16 (SH) | F CTCGGCGGCTGCGGAGA (SEQ ID NO: 196); R CGCCGCCCGCTGCCT (SEQ ID NO: 197)

*F = Forward Primer, R = Reverse Primer
**x = amino C7 modifier

Libraries were labeled during PCR by incorporation of universal primers containing 5′ cyanine fluorophores. Labeling reactions were as follows: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 0.25 μM each of Cy5 or Cy3 5′-labeled universal primers (Table V, SEQ ID NO: 168 and SEQ ID NO: 170), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), 5 units of Titanium Taq polymerase (Clontech), and 1.5 μl of library DNA from the previous step in a final volume of 75 μl. Reactions were carried out at 95° C. for 3 min, followed by 8 cycles of 94° C. for 15 sec and 65° C. for 1.5 min on an I-Cycler real-time PCR instrument (Bio-Rad). Cancer DNA was labeled with Cy3 and normal with Cy5. Multiple labeling reactions were pooled, diluted with 4 volumes of TE-L buffer and purified using MultiScreen PCR cleanup (Millipore). The purified labeled DNA was quantified by optical density.

The distribution of promoter sites and the level of their enrichment in amplified secondary methylome libraries from cancer DNA were analyzed by quantitative PCR using primer pairs amplifying short amplicons that do not contain recognition sites for at least two of the methylation-sensitive restriction enzymes employed in the present example (Table V, SEQ ID NO: 190 through SEQ ID NO: 197). Mechanically fragmented genomic DNA from the peripheral blood of a healthy donor was used as a control for relative copy number evaluation.

Aliquots of 50 ng of amplified secondary methylome libraries prepared from LNCaP cell line or control genomic DNA fragmented to an average size of 1.5 Kb on a HydroShear device (Gene Machines) for 20 passes at a speed code of 3 were analyzed by quantitative real-time PCR in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table V, SEQ ID NO: 190 and SEQ ID NO: 191 for GSTP-1 promoter, SEQ ID NO: 192 and SEQ ID NO: 193 for RASSF-1 promoter, SEQ ID NO: 194 and SEQ ID NO: 195 for CD44 promoter, and SEQ ID NO: 196 and SEQ ID NO: 197 for p16 promoter), and 3 units of Titanium Taq polymerase (Clontech) in a final volume of 30 μl at 95° C. for 3 min followed by 47 cycles at 94° C. for 15 sec and 68° C. for 1 min.

FIG. 66 shows typical amplification curves of four promoter sites, three of which (GSTP-1, RASSF-1, and CD44) are methylated and one (p16) that is not methylated in the exemplary LNCaP cell line. For methylated promoters, between 4 and 7 cycles of left shift (enrichment of between 16- and 128-fold) of the amplification curves from the methylome library was observed relative to the curve corresponding to control non-amplified genomic DNA. For the non-methylated p16 promoter, a curve delayed approximately 4 cycles relative to the control appeared. However, this curve did not correspond to the correct size amplicon and was most likely a product of mis-priming.
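The relation between the cycle shift and the fold enrichment quoted above is the standard exponential one for real-time PCR. The Python sketch below assumes ideal amplification (a doubling per cycle); the function name and the `efficiency` parameter are introduced here for illustration only.

```python
def fold_enrichment(cycles_left_shift, efficiency=1.0):
    """Fold difference in starting template implied by a left shift of the curve.

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    return (1 + efficiency) ** cycles_left_shift

print(fold_enrichment(4))  # 16.0
print(fold_enrichment(7))  # 128.0
```

With an amplification efficiency below 1.0, the same cycle shift corresponds to a smaller fold enrichment, so the 16- to 128-fold range should be read as the value for ideal doubling.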

Example 31 Preparation of Libraries from Cell-free DNA Isolated from Serum and Urine and their Utility for Detection of Promoter Hypermethylation

This example describes a method for preparation of libraries from the cell-free DNA fraction of serum or urine and their utility for detection of promoter hypermethylation. The principle of this method is described in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No. 2004/0209299 and now abandoned.

Cell-free DNA isolated from plasma, serum, and urine is typically characterized by very low amounts (nanogram quantities) and extremely short size (~200 bp). In principle, the random-prime amplification method described in Example 29 can be applied to DNA of this size, but with about 10 times lower amplification efficiency compared to high molecular weight DNA isolated from tissue or cultured cells. An alternative and more efficient method of preparing Methylome libraries from very short DNA fragments utilizes elements of the invention. As in the above examples, a simultaneous digestion of DNA in one reaction buffer with multiple (five or more) methylation-sensitive restriction endonucleases is combined with whole genome amplification from universal sequences attached to DNA fragments by ligation. With this methylome library approach, DNA can be digested with the nuclease cocktail before or after library synthesis. In this detailed Example, the multi-endonuclease cleavage occurs after library synthesis and ensures that amplicons containing multiple non-methylated restriction sites are efficiently eliminated by cleavage and therefore not amplified.

Blood collected from healthy donors or from prostate cancer patients was aliquoted into 6 ml Vacutainer SST Serum Separation tubes (Becton-Dickinson), incubated for 30 min at ambient temperature, and centrifuged at 1,000×g for 10 min. The upper serum phase was collected and stored at −20° C. until use. DNA was isolated using Charge Switch Kit (DRI cat # 11000) and a modified protocol for DNA from blood. One ml of serum was incubated with 700 μl of lysis buffer provided with the kit, 30 μl of proteinase K (20 mg/ml), and 5 μl of RNase A/T1 cocktail (Ambion cat # 2288) at 25° C. for 20 min with gentle rotation. Two hundred and fifty μl of purification buffer and 30 μl of magnetic beads were then added to each sample, followed by incubation at 25° C. for 2 min. Tubes were placed on a magnetic rack for 2 min. Supernatant was removed and beads were washed 3 times with 1 ml each of washing buffer. Beads were then resuspended in 40 μl of elution buffer and incubated at 25° C. for 2 min. Samples were placed on a magnetic rack for 2 min and supernatant was transferred to a new tube. DNA was quantified on a fluorescent spectrophotometer using Pico Green (Molecular Probes) and λ phage DNA standards.

Another source of cell-free DNA for methylome preparation was urine of healthy donors or of prostate cancer patients, collected in 50 ml Falcon tubes and stabilized for storage by adding 0.1 volume of 0.5 M EDTA. Urine samples were centrifuged at 1,800×g for 10 min at ambient temperature to sediment cells, and supernatant was transferred carefully to a fresh tube. An equal volume of 6 M guanidine thiocyanate was added to each sample, followed by ⅙ vol of Wizard Miniprep resin (Promega catalog # A7141). DNA was bound to the resin by rotation for 1 hour at ambient temperature. The resin was then sedimented by brief centrifugation at 500×g and, after carefully decanting the supernatant, loaded on Wizard minicolumns (Promega catalog # A7211) using syringe barrel extensions. Resin was washed with 5 ml of wash buffer (Promega catalog # A8102) using a Qiagen QIAvac 24 vacuum manifold. Minicolumns were then centrifuged for 2 min at 10,000×g to remove residual wash buffer, and bound DNA was eluted with 50 μl of DNAse-free water at 10,000×g for 1 min. Eluted DNA was buffered by adding 0.1 vol of 10× TE-L buffer and quantified by fluorescent spectrophotometer using Pico Green (Molecular Probes) and λ phage DNA standards. FIG. 54 A shows analysis of DNA samples isolated from serum and urine by gel electrophoresis on 1.5% agarose. A typical banding pattern characteristic of apoptotic nucleosomal size is observed.

To repair DNA ends, 100 ng aliquots of purified cell-free serum or urine DNA were incubated in 1× T4 ligase buffer (NEB) with 0.8 units of Klenow fragment of DNA polymerase I (USB Corporation), 0.1 mg/ml of bovine serum albumin (BSA), and 16.7 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 24 μl.

For preparation of methylome libraries, repaired DNA was incubated with universal K_(U) adaptor (Table VI) at 1.25 μM and 800 units of T4 DNA ligase in 32 μl of 1× T4 DNA ligase buffer (NEB) for 1 hour at 25° C. followed by 15 min at 75° C. DNA was precipitated with ethanol in the presence of 0.3 M sodium acetate and 2 μl of PelletPaint (Novagen), washed with 75% ethanol, air dried, and resuspended in 34.4 μl of DNAase-free water. Samples were then supplemented with 4 μl of 10× NEBuffer 4 (NEB) and split into 2 aliquots. Following pre-heating at 70° C. for 5 min and cooling to 37° C. for 2 min, one aliquot was digested with 2.66 units each of AciI and HhaI, and 1.33 units each of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. in a final volume of 20 μl. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control). Libraries were amplified using real-time PCR monitoring in a reaction mixture comprising the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer K_(U) (Table VI, SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After initial incubation at 75° C. for 15 min to fill in the recessed 3′ ends of the ligated DNA libraries, amplifications were carried out at 95° C. for 3 min, followed by 13 cycles of 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density. FIG. 54 B shows analysis of DNA from libraries prepared from urine by electrophoresis on 1.5% agarose gels.

TABLE VI
OLIGONUCLEOTIDE ADAPTORS USED FOR PREPARATION OF METHYLOME LIBRARIES FROM SERUM AND URINE DNA

Code                  Sequence*
K_(U) Adaptor         5′-CCAAACACACCCx-3′ (SEQ ID NO: 171)
                      3′-GGTTTGTGTGGGTTGTGT-5′ (SEQ ID NO: 15)
dU-Hairpin Adaptor    5′-TGTGTTGGGdUGdUGTGTGGdUdUdUdUdUdUCCACACACACCCAACACA-3′ (SEQ ID NO: 172)**
Mu-1 Primer           5′-CCACACACACCCAACACA-3′ (SEQ ID NO: 173)

* x = amino C7 modifier
** dU = deoxy-Uridine
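The base-pairing relationships among the Table VI oligonucleotides can be checked with a short script. This is an illustrative sketch only: the variable names are hypothetical, the amino C7 modifier "x" is omitted, and deoxy-uridine is written as T on the assumption that it pairs with A in the same way.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# K_U adaptor strands from Table VI.
KU_SHORT = "CCAAACACACCC"              # SEQ ID NO: 171, 5'->3', modifier omitted
KU_LONG = "GGTTTGTGTGGGTTGTGT"[::-1]   # SEQ ID NO: 15; the table lists it 3'->5'

# dU-hairpin adaptor, with dU replaced by T for pairing purposes (assumption).
HAIRPIN = "TGTGTTGGGdUGdUGTGTGGdUdUdUdUdUdUCCACACACACCCAACACA".replace("dU", "T")
MU1 = "CCACACACACCCAACACA"             # SEQ ID NO: 173

STEM_LENGTH = 18  # length of each arm of the inverted repeat

# The short strand should pair with the 3' portion of the long strand.
short_pairs_with_long = KU_LONG.endswith(reverse_complement(KU_SHORT))
# The two arms of the hairpin should be reverse complements of each other.
hairpin_is_inverted_repeat = (
    HAIRPIN[:STEM_LENGTH] == reverse_complement(HAIRPIN[-STEM_LENGTH:])
)
# The Mu-1 primer should match the 3' arm of the hairpin.
hairpin_ends_with_mu1 = HAIRPIN.endswith(MU1)

print(short_pairs_with_long, hairpin_is_inverted_repeat, hairpin_ends_with_mu1)
```

The checks only confirm sequence complementarity as listed in the table; they say nothing about ligation efficiency or about how uracil-DNA glycosylase acts on the dU positions.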

Specific regions within the library template DNA may show resistance to digestion based on their level of methylation. Promoter sequences rich in CpG methylation are thereby quantified in the amplified Methylome libraries using quantitative real-time PCR assays with the promoter-specific primers described in Table VII. Aliquots of 75 ng of each DNA sample were assayed by quantitative real-time PCR in reaction mixtures comprising the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII), and 2.5 units of Titanium Taq polymerase (Clontech) in a final volume of 25 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

FIGS. 55 and 56 show typical amplification curves of promoter sites for genes implicated in cancer from methylome libraries synthesized from the serum and urine DNA of cancer patients as compared to healthy donor controls. As expected, the level of methylation in serum and urine DNA from cancer patients was much lower than in tumor tissue or cancer cell lines, since cancer DNA in circulation represents only a relatively small fraction of the total cell-free DNA. This trend is especially pronounced for urine DNA. Nevertheless, the method disclosed here is sensitive enough to reliably detect methylation in body fluids and can be applied as a diagnostic tool for early detection, prognosis, or monitoring of the progression of cancer.

TABLE VII
PRIMER PAIRS USED FOR METHYLATION ANALYSIS OF SERUM AND URINE METHYLOME LIBRARIES BY REAL-TIME PCR

Promoter      Primer*   Sequence (5′-3′)
APC-1         F         CGGGTCGGGAAGCGGAGAG (SEQ ID NO: 113)
              R         TGGCGGGCTGCACCAATACAG (SEQ ID NO: 114)
MDR-1         F         GGGTGGGAGGAAGCATCGTC (SEQ ID NO: 174)
              R         GGTCTCCAGCATCTCCACGAA (SEQ ID NO: 175)
BRCA-1        F         CCCTTGGTTTCCGTGGCAAC (SEQ ID NO: 107)
              R         CTCCCCAGGGTTCACAACGC (SEQ ID NO: 108)
CD44          F         CCTCTGCCAGGTTCGGTCC (SEQ ID NO: 165)
              R         GCTGCGTGCCACCAAAACTTGTC (SEQ ID NO: 166)
GSTP-1        F         TGGGAAAGAGGGAAAGGCTTC (SEQ ID NO: 176)
              R         CCCCAGTGCTGAGTCACGG (SEQ ID NO: 177)
RASSF-1       F         GCCCAAAGCCAGCGAAGCAC (SEQ ID NO: 103)
              R         CGCCACAGAGGTCGCACCA (SEQ ID NO: 104)
E-Cadherin    F         GCTAGAGGGTCACCGCGT (SEQ ID NO: 28)
              R         CTGAACTGACTTCCGCAAGCTC (SEQ ID NO: 29)
PTGS-2        F         AGAACTGGCTCTCGGAAGCG (SEQ ID NO: 178)
              R         GGGAGCAGAGGGGGTAGTC (SEQ ID NO: 179)
EDNRB         F         GGGCATCAGGAAGGAGTTTCGAC (SEQ ID NO: 143)
              R         TCGCCAGTATCCACGCTCAA (SEQ ID NO: 144)
P16 Exon 2    F         GCTTCCTGGACACGCTGGT (SEQ ID NO: 180)
              R         TCTATGCGGGCATGGTTACTG (SEQ ID NO: 181)

* F = Forward primer, R = Reverse primer

Example 32 Optimization of Library Preparation from Cell-free DNA Isolated from Urine

In clinical applications it is very important to have simple, fast, and reliable tests. This example describes the development of a single-tube library preparation and amplification method for methylome libraries from cell-free urine DNA and its advantages over a two-step protocol.

Cell-free DNA was isolated and quantified from urine as described in Example 31. Aliquots of the purified DNA were processed for library preparation and amplification according to two different protocols as described below.

In the two-step protocol, a 100 ng DNA aliquot was processed for enzymatic repair of termini by incubation in 1× T4 ligase buffer (NEB) with 0.8 units of Klenow fragment of DNA polymerase I (USB Corporation), 0.1 mg/ml of BSA, and 16.7 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 24 μl. For library preparation, repaired DNA was incubated with universal K_(U) adaptor (Table VI) at 1.25 μM and 800 units of T4 DNA ligase in 32 μl of 1× T4 DNA ligase buffer (NEB) for 1 hour at 25° C. followed by 15 min at 75° C. DNA was precipitated with ethanol in the presence of 0.3 M sodium acetate and 2 μl of PelletPaint (Novagen), washed with 75% ethanol, air dried, and resuspended in 34.4 μl of DNAase-free water. The sample was then supplemented with 4 μl of 10× NEBuffer 4 (NEB) and split into 2 aliquots. Following pre-heating at 70° C. for 5 min and cooling to 37° C. for 2 min, one aliquot was digested with 2.66 units each of AciI and HhaI, and 1.33 units each of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. in a final volume of 20 μl. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).
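The volumes quoted for the two-step protocol can be checked with simple dilution arithmetic. The sketch below assumes that the remainder of each 20 μl digest is made up by the enzyme cocktail and that this addition contributes no NEBuffer 4; both are assumptions for illustration, and the variable names are hypothetical.

```python
# Volumes and concentrations taken from the two-step protocol above.
resuspension_ul = 34.4   # DNA resuspended in water
buffer_stock_x = 10.0    # 10x NEBuffer 4
buffer_added_ul = 4.0
n_aliquots = 2
digest_final_ul = 20.0   # final volume of each digest

pooled_ul = resuspension_ul + buffer_added_ul
aliquot_ul = pooled_ul / n_aliquots

# Buffer strength before splitting, and after bringing an aliquot to 20 ul.
buffer_x_pooled = buffer_stock_x * buffer_added_ul / pooled_ul
buffer_x_digest = buffer_x_pooled * aliquot_ul / digest_final_ul

# Volume left in each digest for enzymes (assumed to carry no buffer).
enzyme_allowance_ul = digest_final_ul - aliquot_ul

print(f"aliquot volume: {aliquot_ul:.1f} ul")
print(f"buffer in pooled sample: {buffer_x_pooled:.2f}x")
print(f"buffer in 20 ul digest: {buffer_x_digest:.2f}x")
print(f"volume left for enzymes: {enzyme_allowance_ul:.1f} ul")
```

Under these assumptions the slightly concentrated buffer in the pooled sample works out to 1× once each aliquot is brought to 20 μl, which is consistent with the otherwise unusual 34.4 μl resuspension volume.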

In the single-tube protocol, a 100 ng DNA aliquot was processed for enzymatic repair of termini by incubation in 1× NEBuffer 4 (NEB) with 0.8 units of Klenow fragment of DNA polymerase I (USB Corporation), 0.1 mg/ml of BSA, and 16.7 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 24 μl. The sample of repaired DNA was supplemented with universal K_(U) adaptor (Table VI) at a final concentration of 1.25 μM, 800 units of T4 DNA ligase, and 1 mM ATP in 1× NEBuffer 4 (NEB) added to a final volume of 32 μl. Ligation was carried out for 1 hour at 25° C. followed by 15 min at 75° C. The sample was split into 2 aliquots of 16 μl each. Following pre-heating at 70° C. for 5 min and cooling to 37° C. for 2 min, one aliquot was digested with 2 units each of AciI and HhaI, and 1 unit each of BstUI, HpaII, and Hinp1I (NEB). The sample was incubated for 12 hours at 37° C., followed by 2 hours at 60° C. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Libraries were amplified using quantitative real-time PCR monitoring by supplementing the reactions with PCR master mix to the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer K_(U) (Table VI, SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After initial incubation at 75° C. for 15 min to fill in the recessed 3′ ends of the ligated DNA libraries, amplifications were carried out at 95° C. for 3 min, followed by 13 cycles of 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density.
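The library amplification program described above can be written down as a small data structure, which also gives a rough lower bound on run time. The class and function names are hypothetical, and instrument ramp times between temperatures are ignored, so the figure is an underestimate of the real run time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold in a thermal cycling program."""
    temp_c: float
    seconds: int


# Program taken from the library amplification paragraph above.
FILL_IN = Step(temp_c=75, seconds=15 * 60)          # fill in recessed 3' ends
INITIAL_DENATURE = Step(temp_c=95, seconds=3 * 60)
CYCLE = (Step(temp_c=94, seconds=15), Step(temp_c=65, seconds=2 * 60))
N_CYCLES = 13


def hold_time_minutes(pre_steps, cycle_steps, cycles):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    pre = sum(step.seconds for step in pre_steps)
    cycling = cycles * sum(step.seconds for step in cycle_steps)
    return (pre + cycling) / 60


total = hold_time_minutes((FILL_IN, INITIAL_DENATURE), CYCLE, N_CYCLES)
print(f"programmed hold time: {total} min")
```

The same structure can represent the other programs in these examples by changing the cycle count or hold times.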

The presence of methylated DNA in the sample template was exhibited by resistance to cleavage with the methylation-sensitive enzyme cocktail and representation in the resulting methylome libraries. Promoter sequences in the amplified libraries were analyzed using quantitative real-time PCR with primers to the relevant cancer genes (Table VII). Aliquots of 75 ng of each DNA sample were assayed by quantitative real-time PCR in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII), and 2.5 units of Titanium Taq polymerase (Clontech) in a final volume of 25 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

FIG. 57 shows typical amplification curves comparing libraries prepared with the single-tube protocol to those prepared with the two-step protocol. As shown, the cut samples from the single-tube protocol had a greatly reduced background compared to the two-step protocol, whereas the uncut samples amplified identically. This results in a significant improvement of the dynamic range of the assay. Another apparent advantage of the single-tube protocol is reduced hands-on time and improved suitability for high throughput and automation.

Example 33 Establishing the Dynamic Range and Sensitivity Limits of Methylation Detection in Urine Samples using Mixed Libraries of Artificially Methylated and Non-methylated DNA

This example demonstrates the sensitivity range of methylation detection in samples of free DNA in urine as disclosed in the present invention.

Cell-free DNA isolated from urine as described in Example 31 was artificially methylated to completion at all CpG sites by incubating 50 ng DNA in 10 μl of NEBuffer 2 (NEB) with 4 units of M.SssI CpG methylase (NEB) in the presence of 160 μM S-adenosylmethionine (SAM) for 1 hour at 37° C.

Input urine DNA shown to be essentially non-methylated across the panel of promoters analyzed (results not shown) was used as a control. Artificially methylated and untreated control DNA samples were mixed at different ratios to a final content of methylated DNA of 0%, 0.01%, 0.1%, 1%, and 10%. Aliquots of each mix containing 50 ng of total DNA were processed for library synthesis using an adaptation of the single-tube protocol described in Example 32. Samples were incubated in 1× NEBuffer 4 (NEB) with 0.36 units of T4 DNA polymerase (NEB), 2 μM of dU-Hairpin Adaptor (Table VI, SEQ ID NO: 172), 1 unit of uracil-DNA glycosylase (UDG), 800 units of T4 DNA ligase, 40 μM dNTPs, 1 mM ATP, and 0.1 mg/ml BSA for 1 hour at 37° C. in a final volume of 15 μl. The samples were split into 2 equal aliquots and one aliquot was digested with 6.67 units each of AciI and HhaI, and 3.33 units each of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. in 15 μl of NEBuffer 4. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Libraries were amplified using quantitative real-time PCR monitoring by supplementing the reactions with PCR master mix to final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer Mu-1 (Table VI, SEQ ID NO: 173), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Amplifications were carried out at 95° C. for 5 min, followed by 15 cycles of 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density.

Methylation analysis was performed using real-time PCR with primers directed to a segment of the human MDR-1 promoter. Aliquots of 75 ng of each digested or non-digested DNA sample were assayed by quantitative real-time PCR in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 174 and SEQ ID NO: 175), and 2.5 units of Titanium Taq polymerase (Clontech) in a final volume of 35 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

As shown in FIG. 58, as little as 0.01% of methylated DNA can be reliably detected in a background of 99.99% of non-methylated DNA. The figure also shows that the method disclosed in the present invention has a dynamic range of at least 3 orders of magnitude.
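The dilution series in this example can be restated numerically. The sketch below computes how much methylated DNA each 50 ng mix contains and the span of the non-zero series in orders of magnitude; the function names are hypothetical, and the calculation only restates the mixing ratios given above rather than any measured result.

```python
import math

TOTAL_NG = 50.0                                  # total DNA per mix
METHYLATED_PERCENT = (0.0, 0.01, 0.1, 1.0, 10.0)  # mixing ratios in the text


def methylated_mass_ng(percent: float, total_ng: float = TOTAL_NG) -> float:
    """Mass of methylated DNA in a mix with the given methylated percentage."""
    return total_ng * percent / 100.0


def orders_of_magnitude(lowest: float, highest: float) -> float:
    """Span between two positive values, in powers of ten."""
    return math.log10(highest / lowest)


for pct in METHYLATED_PERCENT:
    picograms = methylated_mass_ng(pct) * 1000.0
    print(f"{pct:>5}% methylated -> {picograms:.0f} pg of methylated DNA")

nonzero = [p for p in METHYLATED_PERCENT if p > 0]
span = orders_of_magnitude(min(nonzero), max(nonzero))
print(f"span of the series: {span:.1f} orders of magnitude")
```

The lowest non-zero mix therefore corresponds to only a few picograms of methylated DNA in the 50 ng input.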

Example 34 Comparison between Klenow Fragment of DNA Polymerase I and T4 DNA Polymerase for their Ability to Preserve Methylation of CpG Islands During Preparation of Methylome Libraries

Cell-free DNA in urine or circulating in plasma and serum is likely to be excessively nicked and damaged due to its natural apoptotic source and the presence of nuclease activities in blood and urine. During repair of ends using a DNA polymerase with 3′-exonuclease activity, internal nicks are expected to be extended, a process that can potentially lead to replacement of methyl-cytosine with non-methylated cytosine and loss of the methylation signature. The stronger the strand displacement (or nick-translation) activity of the polymerase, the more likely it is that 5-methylcytosine will be replaced with normal cytosine during the repair process. This example compares two DNA polymerases capable of polishing DNA termini to produce blunt ends and the ability of each to preserve the methylation signature of CpG islands prior to cleavage with methylation-sensitive restriction enzymes.

Cell-free DNA isolated from urine as described in Example 31 was artificially methylated at all CpG sites by incubating 100 ng DNA in 50 μl of NEBuffer 2 (NEB) with 4 units of M.SssI CpG methylase (NEB) in the presence of 160 μM S-adenosylmethionine (SAM) for 1 hour at 37° C.

Two 50 ng aliquots of methylated DNA were processed for enzymatic repair of termini by incubation in 1× NEBuffer 4 (NEB) containing either 0.8 units of Klenow fragment of DNA polymerase I (USB Corporation) or 0.48 units of T4 DNA Polymerase (NEB), 0.1 mg/ml of BSA, and 26.7 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 30 μl. Samples were supplemented with universal K_(U) adaptor (Table VI) at a final concentration of 1.25 μM, 800 units of T4 DNA ligase, and 1 mM ATP in 1× NEBuffer 4 (NEB) added to a final volume of 38 μl. Ligation was carried out for 1 hour at 25° C. followed by 15 min at 75° C. The samples were split into 2 aliquots of 19 μl each and one aliquot was digested with 10 units each of AciI and HhaI, and 5 units each of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Libraries were amplified, with the process monitored by quantitative real-time PCR, by supplementing the reactions with PCR master mix to final concentrations of: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer K_(U) (Table VI, SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After initial incubation at 75° C. for 15 min to fill in the recessed 3′ ends of the ligated DNA libraries, amplifications were carried out at 95° C. for 3 min, followed by 12-14 cycles of 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density.

The preservation of the methylation signature for each repair process was assessed by amplifying 4 human promoter sites from cut and uncut libraries. Aliquots of 80 ng of each DNA sample were assayed by quantitative real-time PCR in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table IV, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 143, SEQ ID NO: 144, and Table VII, SEQ ID NO: 180 and SEQ ID NO: 181), and 2.5 units of Titanium Taq polymerase (Clontech) in a final volume of 25 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

As shown in FIG. 59, when fully methylated urine DNA was treated with Klenow fragment of DNA polymerase I prior to restriction cleavage, a 2-3 cycle shift of the amplification curves was observed, suggesting that a significant fraction (estimated 75% to 90%) of methyl-cytosine was lost during the DNA end repair. On the other hand, when T4 polymerase was used for DNA end repair, the shift was only one cycle or less depending on the site analyzed. This suggests that 50% or more of the methyl-cytosine was preserved. These results are in agreement with literature data showing that E. coli DNA polymerase I has stronger strand-displacement activity than T4 polymerase. Thus, T4 DNA polymerase is the enzyme of choice to produce blunt ends for methylome library preparation from urine or other sources of degraded or nicked DNA.
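The percentages above follow from the cycle shifts if each cycle of delay is taken to mean a two-fold loss of cleavage-resistant template. The sketch below makes that arithmetic explicit; it assumes ideal doubling per cycle and that the shift is the delay of the digested library relative to its uncut control, and the function name is hypothetical.

```python
def fraction_preserved(cycle_shift: float) -> float:
    """Fraction of methylated template surviving digestion.

    Assumes one cycle of delay corresponds to a two-fold loss of template
    (ideal doubling per cycle; an assumption, not a measured efficiency).
    """
    return 2.0 ** (-cycle_shift)


for label, shift in (("Klenow", 2), ("Klenow", 3), ("T4", 1)):
    kept = fraction_preserved(shift)
    print(f"{label}: {shift}-cycle shift -> "
          f"{kept:.1%} preserved, {1 - kept:.1%} lost")
```

Under this assumption a 3-cycle shift corresponds to a loss of 87.5%, close to the upper estimate of about 90% quoted in the text.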

Example 35 Sodium Bisulfite Conversion and Amplification of Whole Methylome Libraries Prepared by Ligation of Universal Adaptor Sequence

This example demonstrates that WGA libraries prepared by a modification of the method described in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No. 2004/0209299 and now abandoned, can be converted with sodium bisulfite and amplified to a scale suitable for genome-wide methylation studies using, for example, a methylation-specific PCR method or other available techniques. To protect the adaptor sequences from conversion, dCTP in the nucleotide mix is substituted with methyl-dCTP during fill-in of 3′ library ends. The source of DNA can be urine, plasma, serum, feces, sputum, saliva, tissue biopsy, cultured cells, frozen tissue, or any other source suitable for library preparation, for example. This example demonstrates the application of bisulfite-converted DNA libraries and their utility in conjunction with methylation-specific restriction digestion (as in Examples 29 and 31). Samples from sources such as serum or urine, where a major fraction of DNA may originate from normal cells and cancer DNA constitutes only a very small fraction (less than 1%), may benefit from increased sensitivity. Application of the invention in this form is particularly important because it greatly reduces or may even completely eliminate non-methylated DNA from the library. As a consequence, techniques other than MSP can be used to quantitatively analyze DNA methylation.

One hundred nanograms of non-methylated cell-free DNA isolated from urine as described in Example 31 was processed for library preparation by incubation in 1× NEBuffer 4 (NEB) comprising 1.5 units of T4 DNA Polymerase (NEB), 0.1 mg/ml of BSA, and 100 μM each of dATP, dGTP, dTTP, and methyl-dCTP for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 10 μl. Samples were supplemented with universal K_(U) adaptor (Table VI) at a final concentration of 1.43 μM, 400 units of T4 DNA ligase, and 1 mM ATP in 1× NEBuffer 4 (NEB) added to a final volume of 14 μl. Ligation was carried out for 1 hour at 25° C. followed by 15 min at 75° C. To displace the short oligonucleotide of the adaptor (Table VI, SEQ ID NO: 171) and to fill in the 3′ ends of the library molecules incorporating methyl-cytosine, 1.25 units of Titanium Taq polymerase (BD-Clontech) were added and the sample was incubated for 15 min at 72° C. The sample was diluted to 20 μl with water and an 18 μl aliquot (90% of the total DNA) was processed for bisulfite conversion using the EZ DNA Methylation Kit (Zymo Research cat # D5001) following the manufacturer's protocol. The remaining 10% of the library was left untreated (non-converted control).

Aliquots of the converted library corresponding to 20 ng, 10 ng, 1 ng, and 0.1 ng and aliquots of the non-converted control corresponding to 3 ng, 1 ng, and 0.1 ng were amplified by quantitative real-time PCR in a reaction mixture containing the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal K_(U) primer (SEQ ID NO: 15), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 50 μl. Reactions were carried out at 95° C. for 1 min, followed by 23 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup (Millipore) and quantified by optical density.

FIG. 60A shows real-time PCR amplification curves for a range of input DNA from libraries of bisulfite-converted and non-converted DNA. The calculated threshold cycle for each DNA amount was used to construct standard curves by linear regression analysis (i-Cycler software, Bio-Rad). These calculations showed that approximately 30% of the DNA was amplifiable after sodium bisulfite conversion.
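A standard-curve calculation of this kind can be sketched as follows. The threshold-cycle values in the script are hypothetical placeholders, not data from FIG. 60A, and the function names are illustrative; the instrument software mentioned above may perform the regression differently.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def equivalent_input_ng(ct, slope, intercept):
    """Read an input amount (ng) off a Ct versus log10(ng) standard curve."""
    return 10 ** ((ct - intercept) / slope)


# HYPOTHETICAL threshold cycles for the non-converted control inputs.
control_ng = [3.0, 1.0, 0.1]
control_ct = [13.4, 15.0, 18.3]
slope, intercept = fit_line([math.log10(x) for x in control_ng], control_ct)

# HYPOTHETICAL threshold cycle for a converted library aliquot.
converted_ng, converted_ct = 10.0, 13.5
amplifiable = equivalent_input_ng(converted_ct, slope, intercept) / converted_ng
print(f"amplifiable fraction (hypothetical data): {amplifiable:.0%}")
```

The amplifiable fraction is the amount of non-converted DNA that would give the same threshold cycle, divided by the nominal amount of converted DNA loaded into the reaction.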

To confirm the conversion of library DNA, the present inventors performed real-time PCR with modified human STS primers specific for converted DNA that do not contain the CpG dinucleotide. Reaction mixtures comprised the following: 1× Titanium Taq reaction buffer (Clontech), 50 ng of converted or non-converted library DNA, 200 μM of each dNTP, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VIII), and 2.5 units of Titanium Taq polymerase (Clontech) in a final volume of 25 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

TABLE VIII
PRIMER PAIRS USED FOR ANALYSIS OF BISULFITE-CONVERTED AMPLIFIED LIBRARIES

UniSTS #    Primer   Sequence (5′-3′)
175841      F        TTTGATGTTAGGATATGTTGAAA (SEQ ID NO: 182)
            R        AAAAACAAAAAAAATCTCTTAAC (SEQ ID NO: 183)
170707      F        ATTTACTACTTAATATTACCTAC (SEQ ID NO: 184)
            R        TTATGTGTGGGTTATTAAGGATG (SEQ ID NO: 185)

As shown in FIG. 60B, real-time PCR curves from converted DNA appeared 8 to 10 cycles earlier than those from non-converted DNA. Also, only the amplification products from the converted library were of the expected size (data not shown).

Example 36 Enrichment of Libraries Prepared from AluI Digested Genomic DNA for Promoter Sequences by Heat Treatment

This example demonstrates that libraries prepared from AluI-digested DNA essentially as described in U.S. patent application Ser. No. 10/797,333, filed Mar. 8, 2004, published as U.S. Patent Application Publication No. 2004/0209299 and now abandoned, can be enriched for promoter sequences by pre-heating fragmented DNA prior to library preparation at temperatures that will selectively denature subsets of DNA fragments based on their GC content, thus making a fraction of the genome incompetent for ligation.

Human genomic DNA isolated from the peripheral blood of a healthy donor by standard procedures was digested with 10 units of AluI restriction endonuclease (NEB) for 1 hour following the manufacturer's protocol. Aliquots of 70 ng were pre-heated in 15 μl of 1× NEBuffer 4 (NEB) for 10 min at 75° C. (control), 83° C., 84.1° C., 85.3° C., 87° C., 89.1° C., 91.4° C., 93.5° C., 94.9° C., 96° C., or 97° C. followed by snap-cooling at −10° C. in an ice/ethanol bath.

For library preparation, the pre-heated DNA samples were incubated in a reaction mixture comprising 1× NEBuffer 4, 1.25 μM of universal K_(U) adaptor (Table VI), 800 units of T4 DNA ligase, and 1 mM ATP in a final volume of 21 μl. Ligation was carried out for 1 hour at 25° C. followed by 15 min at 75° C.

Libraries were amplified by quantitative PCR by supplementing the reactions with PCR master mix to the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer K_(U) (Table VI, SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After initial incubation at 75° C. for 15 min to fill in the recessed 3′ ends of the ligated DNA libraries, amplifications were carried out at 95° C. for 3 min, followed by cycling at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup system (Millipore) and quantified by optical density reading.

Forty nanograms of purified library DNA were used to analyze promoter sequences of high, intermediate, or low GC content by quantitative PCR, as exemplified by the GSTP-1, MDR-1, and APC promoters, respectively. Quantitative PCR was performed in reaction mixtures comprising the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 176 and SEQ ID NO: 177 for GSTP-1 promoter, SEQ ID NO: 113 and SEQ ID NO: 114 for APC-1 promoter, and SEQ ID NO: 139 and SEQ ID NO: 140 for MDR-1 promoter), and 1.5 units of Titanium Taq polymerase (Clontech) in a final volume of 15 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

As shown in FIGS. 61A, 61B, and 61C, a complex pattern of temperature-dependent shifts of the amplification curves was observed relative to the control treatment of 75° C. Temperatures of between 89° C. and 94° C. resulted in enrichment of on average 2 to 7 cycles (4- to 128-fold) for promoter sites of high to intermediate GC content (FIGS. 61A and 61B), whereas temperatures between 83° C. and 85° C. resulted in 1-2 cycles (2 to 4 times) less efficient amplification. For the lower GC content APC-1 promoter site, the optimal temperature for enrichment was 91.5° C., resulting in about 8-fold enrichment, whereas higher temperatures caused reduced amplification. For all three promoter sites, pre-heating at about 95° C. to 97° C. caused significant reduction of copy number, and complete denaturing for the low GC content APC promoter site.

Example 37 Enrichment of Libraries Prepared from Cell-free Urine DNA for Promoter Sequences by Heat Treatment

This example demonstrates that methylome libraries prepared from cell-free urine DNA can be enriched for promoter sequences by pre-heating prior to library preparation at temperatures that will selectively denature the fraction of DNA having below average GC content, making the more easily denatured fragments incompetent for ligation.
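Because the enrichment relies on GC content, a simple way to rank sequences by how readily they might denature is to compute their GC fraction. The sketch below is illustrative only: the function name is hypothetical, the primer sequences from Table VII are used merely as short stand-ins for the promoter regions, and actual melting behavior also depends on fragment length, salt concentration, and sequence context.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


# Forward primers from Table VII, used here only as example inputs.
examples = {
    "GSTP-1 F": "TGGGAAAGAGGGAAAGGCTTC",
    "MDR-1 F": "GGGTGGGAGGAAGCATCGTC",
    "APC-1 F": "CGGGTCGGGAAGCGGAGAG",
}

# Print the sequences from highest to lowest GC fraction.
for name, seq in sorted(examples.items(), key=lambda kv: -gc_content(kv[1])):
    print(f"{name}: GC fraction {gc_content(seq):.2f}")
```

Fragments with a higher GC fraction would be expected to remain double-stranded, and therefore ligation-competent, at higher pre-heating temperatures.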

Cell-free DNA was isolated and quantified from the urine of a healthy donor as described in Example 31. Aliquots of 22 ng of purified DNA were either heat-treated directly or processed for enzymatic repair of termini with Klenow fragment of DNA polymerase I before heat treatment.

The first set of samples was heated in duplicate directly for 10 min at 75° C. (control), 89° C., 91° C., or 93° C. in 13 μl of NEBuffer 4 (NEB) followed by cooling on ice.

The second set of samples was first incubated in 1× NEBuffer 4 (NEB) with 0.4 units of Klenow fragment of DNA polymerase I (USB Corporation), 0.1 mg/ml of BSA, and 13.3 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 15 μl. After polishing, samples were heated for 10 min at 75° C. (control), 89° C., 91° C., or 93° C., followed by cooling on ice.

The first set was polished after heating by incubation with 0.4 units of Klenow fragment of DNA polymerase I (USB Corporation), 0.1 mg/ml of BSA, and 13.3 μM dNTPs for 15 min at 25° C. followed by 10 min at 75° C. in a final volume of 15 μl.

Both sets of samples were then ligated to universal blunt-end adaptor in a reaction mixture comprising 1.25 μM K_(U) adaptor (Table VI), 800 units of T4 DNA ligase, and 1 mM ATP in 1× NEBuffer 4 (NEB) added to a final volume of 21 μl. Ligations were carried out for 1 hour at 25° C. followed by 15 min at 75° C.

Half of the first set of samples (treated before polishing) was subjected to digestion with a cocktail of methylation-sensitive restriction enzymes comprising 5.8 units of AciI and HhaI, and 2.9 units of BstUI, HpaII, and Hinp1I (NEB) in 1× NEBuffer 4 for 12 hours at 37° C., followed by 2 hours at 60° C. The second half was incubated in parallel but without restriction enzymes (“uncut” controls).

Libraries were amplified by quantitative real-time PCR by supplementing the reactions with PCR master mix to give the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer K_(U) (Table VI, SEQ ID NO: 15), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. After initial incubation at 75° C. for 15 min to fill in the recessed 3′ ends of the ligated DNA libraries, amplifications were carried out at 95° C. for 3 min, followed by cycling at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup system (Millipore) and quantified by optical density reading.
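The final concentrations listed above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The following Python sketch shows one way to do this for the 75 μl reaction; the stock concentrations are hypothetical placeholders, since the specification states only final concentrations, and the function name is illustrative.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock needed to reach final_conc in final_volume_ul.

    final_conc and stock_conc must share the same unit (C1*V1 = C2*V2).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 75.0  # reaction volume stated in the text

# (final concentration from the text, HYPOTHETICAL stock concentration)
components = {
    "each dNTP (uM)": (200.0, 10000.0),
    "universal primer K_U (uM)": (1.0, 100.0),
    "DMSO (%)": (4.0, 100.0),
    "7-deaza-dGTP (uM)": (200.0, 10000.0),
}

for name, (final, stock) in components.items():
    volume = stock_volume_ul(final, stock, FINAL_VOLUME_UL)
    print(f"{name}: {volume:.2f} ul of stock")
```

The stock values should be replaced with those of the reagents actually on hand before any volumes are used at the bench.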

Aliquots of 80 ng of each amplified library were used to analyze promoter sequences for enrichment by Q-PCR in reaction mixtures comprising: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 176 and SEQ ID NO: 177 for GSTP-1 promoter, SEQ ID NO: 113 and SEQ ID NO: 114 for APC-1 promoter, SEQ ID NO: 139 and SEQ ID NO: 140 for MDR-1 promoter, SEQ ID NO: 163 and SEQ ID NO: 164 for CD-44, and Table IX, SEQ ID NO: 186 and SEQ ID NO: 187 for p16 Exon 2), and 1.5 units of Titanium Taq polymerase (Clontech) in a final volume of 15 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

TABLE IX
PRIMERS USED FOR ANALYSIS OF p16 EXON 2

Promoter               Sequence (5′-3′)*
P16 (CDKN2A) Exon 2    F  CAAGCTTCCTTTCCGTCATGCC (SEQ ID NO: 186)
                       R  AGCACCACCAGCGTGTCCA (SEQ ID NO: 187)

*F = Forward Primer, R = Reverse Primer

FIG. 62A shows the analysis of four promoter sequences in libraries prepared from samples heated after enzymatic repair (set 2 described above). Heat-treatment at 89° C. resulted in maximal enrichment in all tested promoter sites, causing a shift between 4 and 7 cycles (16- to 128-fold enrichment), whereas heating at 91° C. resulted in enrichment only for the GC-rich GSTP-1 promoter but had no effect or resulted in delayed amplification for the rest of the promoters. On the other hand, treatment at 93° C. resulted in significant reduction of the copy number of all promoter sites analyzed in cell-free urine DNA libraries.

FIG. 62B compares samples heat-treated before enzymatic repair (set 1 above), with or without subsequent cleavage with methylation-sensitive restriction enzymes, for two CpG islands. As shown, significant enrichment was observed for both CpG islands for libraries pre-treated at 89° C. or 91° C. that were not cut with restriction enzymes. However, no effect of the heat-treatment was found for the samples that were digested with restriction enzymes when the GSTP promoter was analyzed, indicating that the cleavage was complete for this site. On the other hand, when a different CpG site reported to be aberrantly methylated in cancer, p16 Exon 2, was analyzed, both cut and uncut samples were enriched in a similar way by the heat-treatment, suggesting that the enzymatic digestion was perhaps incomplete. In summary, maximal enrichment of promoter sites in libraries prepared from cell-free urine DNA was obtained after pre-heating at 89° C. to 91° C.

Example 38 One Step Preparation of Methylome Libraries from Cell-free Urine DNA by Ligation of Degradable Hairpin Adaptor and their Utility for Analysis of Promoter Hyper-methylation Following Cleavage with Methylation-sensitive Restriction Enzymes

In this example, a single step preparation of methylome libraries from cell-free urine DNA is described where a hairpin oligonucleotide adaptor containing deoxy-uridine in both its 5′ stem region and in its loop (Table VI, SEQ ID NO: 172) is ligated via its free 3′ end to the 5′ phosphates of target DNA molecules in the presence of 3 enzymatic activities: T4 DNA ligase, DNA polymerase, and Uracil-DNA glycosylase (UDG). UDG catalyses the release of free uracil and creates abasic sites in the adaptor's loop region and the 5′ half of the hairpin. The strand-displacement activity of the DNA polymerase extends the free 3′ end of the restriction fragments until an abasic site is reached, which serves as a replication stop. This process results in truncated 3′ ends of the library fragments such that they do not have terminal inverted repeats. The entire process takes place in a single tube in one step and is completed in just 1 hour.

Cell-free DNA isolated from urine as described in Example 31 was artificially methylated by incubating 200 ng DNA in 20 μl of NEBuffer 2 (NEB) with 4 units of M.SssI CpG methylase (NEB) in the presence of 160 μM SAM for 1 hour at 37° C.

Fifty nanograms of methylated or non-methylated DNA were incubated in 1× NEBuffer 4 (NEB) with 0.7 units of T4 DNA polymerase (NEB), 2 μM of dU-Hairpin Adaptor (Table VI, SEQ ID NO: 172), 1 unit of uracil-DNA glycosylase (UDG), 800 units of T4 DNA ligase, 40 μM dNTPs, 1 mM ATP, and 0.1 mg/ml BSA for 1 hour at 37° C. in a final volume of 30 μl. The samples were split into 2 equal aliquots and one aliquot was digested with 20 units of AciI and HhaI, and 10 units of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. in 50 μl of NEBuffer 4. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Aliquots of 5 ng were amplified by quantitative PCR in a reaction mix comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer M_(U)-1 (Table VI, SEQ ID NO: 173), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Samples were pre-heated at 72° C. for 15 min followed by 95° C. for 5 min and cycling at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup system (Millipore) and quantified by optical density reading.

Methylation analysis of promoter sites was performed by real-time PCR using aliquots of 160 ng of each digested or non-digested amplified library DNA incubated in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 137 and SEQ ID NO: 138 for PTGS-1 promoter, SEQ ID NO: 174 and SEQ ID NO: 175 for MDR-1 promoter, SEQ ID NO: 141 and SEQ ID NO: 142 for EDNRB promoter, and Table X, SEQ ID NO: 188 and SEQ ID NO: 189 for APC-1 promoter), and 1.5 units of Titanium Taq polymerase (Clontech) in a final volume of 15 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

TABLE X
PRIMERS USED FOR ANALYSIS OF APC-1 PROMOTER

Promoter    Sequence (5′-3′)*
APC-1       F  CTCCCTCCCACCTCCGGCATCT (SEQ ID NO: 188)
            R  CGCTTCCCGACCCGCACTC (SEQ ID NO: 189)

*F = Forward Primer, R = Reverse Primer

FIG. 63 shows PCR amplification curves of specific promoter sites from amplified libraries prepared from methylated or non-methylated urine DNA with or without cleavage with methylation-sensitive restriction enzymes. As expected, promoter sites from non-methylated cleaved DNA amplified with significant (at least 10 cycles) delay as compared to uncut DNA for all four promoter sites tested. On the other hand, methylated DNA was refractory to cleavage.

Example 39 Simplified Protocol Combining Preparation of Methylome Libraries from Cell-free Urine DNA and Cleavage with Methylation-sensitive Restriction Enzymes in one Step

In this example the preparation of methylome libraries from cell-free urine DNA by ligation of hairpin oligonucleotide adaptor comprising deoxy-uridine as described in Example 38 is combined with the simultaneous cleavage with a mix of methylation-sensitive restriction enzymes in a single step.

Cell-free DNA isolated from urine as described in Example 31 was artificially methylated by incubating 200 ng DNA in 20 μl of NEBuffer 2 (NEB) with 4 units of M.SssI CpG methylase (NEB) in the presence of 160 μM SAM for 1 hour at 37° C.

Twenty-five nanograms of methylated or non-methylated DNA were incubated in 1× NEBuffer 4 (NEB) comprising 0.35 units of T4 DNA polymerase (NEB), 1.5 μM of dU-Hairpin Adaptor (Table VI, SEQ ID NO: 172), 0.5 units of UDG (NEB), 400 units of T4 DNA ligase (NEB), 30 μM dNTPs, 0.75 mM ATP, 75 μg/ml BSA, 16.7 units of AciI, 16.7 units of HhaI, 8.3 units each of BstUI, HpaII, and Hinp1I (NEB) in a final volume of 20 μl for 1 hour at 37° C. A second aliquot of 25 ng of methylated or non-methylated DNA was incubated in parallel with all the ingredients described above but without the restriction enzymes (“uncut” control).

One half of each sample (12.5 ng) was then amplified by quantitative PCR in a reaction mix comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 1 μM universal primer M_(U)-1 (Table VI, SEQ ID NO: 173), 4% DMSO, 200 μM 7-deaza-dGTP (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Samples were pre-heated at 72° C. for 15 min followed by 95° C. for 5 min and 12 cycles at 94° C. for 15 sec and 65° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup system (Millipore) and quantified by optical density reading.

Methylation analysis of promoter sites was performed by real-time PCR using aliquots of 160 ng of each digested or non-digested amplified library DNA incubated in reaction mixtures containing: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, 4% DMSO, 0.5 M betaine, FCD (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 176 and SEQ ID NO: 177 for GSTP-1 promoter, SEQ ID NO: 174 and SEQ ID NO: 175 for MDR-1 promoter, SEQ ID NO: 141 and SEQ ID NO: 142 for EDNRB promoter, and SEQ ID NO: 178 and SEQ ID NO: 179 for PTGS-2 promoter), and approximately 1.5 units of Titanium Taq polymerase (Clontech) in a final volume of 15 μl at 95° C. for 3 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

FIG. 64 shows PCR amplification curves of specific promoter sites in amplified libraries prepared from methylated or non-methylated urine DNA in the presence or in the absence of methylation-sensitive restriction enzymes. As expected, promoter sites from non-methylated cleaved DNA amplified with significant (at least 10 cycles) delay as compared to uncut DNA for all four promoter sites tested. On the other hand, methylated DNA was completely refractory to cleavage. These results demonstrate that the method disclosed in the present invention can be applied as a simple one-step non-invasive high-throughput diagnostic procedure for detection of aberrant methylation in cancer.

Example 40 Methylation Detection Sensitivity of Methylome Libraries Prepared From Dilutions of LNCaP Prostate Cancer Cell Line DNA in Control Non-Methylated DNA

This example describes the analysis of methylation detection sensitivity using libraries prepared by incorporation of universal sequence and amplification with self-inert primers from prostate cancer cell line (LNCaP) DNA diluted in normal non-methylated DNA, following cleavage with methylation-sensitive restriction enzymes.

Primary Methylome libraries were prepared from 20 ng genomic DNA isolated by standard procedure from LNCaP prostate cancer cell line (Coriell Institute for Medical Research), and from normal “unmethylated” DNA (Coriell Institute for Medical Research repository # NA07057), or from mixtures of these two DNAs (see Table XI). Twenty nanogram aliquots of DNA (0, 0.1, 1, 3, 10 and 100% LNCaP mixtures) were pre-heated at 80° C. for 20 min in 92 μl reactions comprising 1× NEBuffer 4 (NEB). Samples were cooled to 37° C. for 2 min and split into two PCR tubes. Two μl of TE buffer were added to the “uncut” tube and 6.7 units each of AciI and HhaI + 3.3 units each of BstUI, HpaII, and Hinp1I (NEB) were added to the “cut” tube. Samples were then incubated for 12 hours at 37° C., followed by 2 hours at 60° C. The DNA was precipitated with ethanol in the presence of 0.3 M sodium acetate and 2 μl of Pellet Paint (Novagen), washed with 75% ethanol, air dried, and resuspended in 10 μl of TE-L buffer. Each 10 ng sample of uncut or cut DNA was randomly fragmented in TE-L buffer by heating at 95° C. for 4 minutes and subjected to library synthesis. The reaction mixtures comprised 10 ng of fragmented DNA in 1× EcoPol buffer (NEB), 200 μM of each dNTP (USB), 200 μM of 7-deaza-dGTP (Sigma), 4% DMSO (Sigma), 360 ng of Single Stranded DNA Binding Protein (USB), and 1 μM of K(N)2 primer (SEQ ID NO: 14) in a final volume of 13 μl. After denaturing for 2 minutes at 95° C., the samples were cooled to 24° C., and the reactions were initiated by adding 5 units of Klenow Exo- DNA polymerase (NEB). Samples were incubated at 24° C. for 1 hour and the reactions were terminated by heating for 5 minutes at 75° C.
Libraries were then amplified by quantitative real-time PCR in a reaction mixture containing the following final concentrations: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP (USB), fluorescein calibration dye (1:100,000) (Bio-Rad) and SYBR Green I (1:100,000) (BioWhittaker Molecular Applications), 1 μM universal K_(U) primer (SEQ ID NO: 15), 4% DMSO (Sigma), 200 μM 7-deaza-dGTP (Roche), and 0.5× of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Reactions were carried out at 95° C. for 3.5 min, followed by 14 cycles of 94° C. for 15 seconds and 65° C. for 2 minutes on an I-Cycler real-time PCR instrument (Bio-Rad). Amplified libraries were purified using MultiScreen PCR cleanup system (Millipore) and quantified by optical density reading.

TABLE XI
DILUTIONS OF LNCaP DNA

% Cancer DNA    LNCaP (ng)    Control (ng)
100             20            0
30              6             14
10              2             18
3               0.6           19.4
1               0.2           19.8
0.1             0.02          19.98
0               0             20
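The masses in Table XI follow from splitting a fixed 20 ng total between cancer cell line DNA and control DNA according to the stated percentage. The following Python sketch reproduces the table rows; the function name is illustrative and does not appear in the specification.

```python
def dilution_masses(percent_cancer, total_ng=20.0):
    """Return (cancer_ng, control_ng) for a mixture of fixed total mass."""
    cancer_ng = total_ng * percent_cancer / 100.0
    control_ng = total_ng - cancer_ng
    # Round to 0.01 ng, the finest step that appears in Table XI.
    return round(cancer_ng, 2), round(control_ng, 2)


for pct in (100, 30, 10, 3, 1, 0.1, 0):
    cancer, control = dilution_masses(pct)
    print(f"{pct:>5}% cancer DNA: {cancer} ng LNCaP + {control} ng control")
```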

Aliquots of 75 ng of each DNA sample were analyzed by quantitative real-time PCR in reaction mixtures comprising the following: 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP (USB), 4% DMSO (Sigma), 0.5 M betaine (Sigma), FCD (1:100,000) (Bio-Rad) and SYBR Green I (1:100,000) (BioWhittaker Molecular Applications), 200 nM each forward and reverse primer (Table IV, SEQ ID NO: 113 and SEQ ID NO: 114 for APC-1 promoter, SEQ ID NO: 30 and SEQ ID NO: 31 for GSTP-1, and SEQ ID NO: 107 and SEQ ID NO: 108 for BRCA-1 promoter), and 0.5× of Titanium Taq polymerase (Clontech) in a final volume of 15 μl at 95° C. for 3.5 min followed by 50 cycles at 94° C. for 15 sec and 68° C. for 1 min.

FIG. 65 shows the threshold cycle (Ct) difference between cut and uncut methylome libraries from real-time PCR for three promoter primer pairs with various percentages of prostate cell line (LNCaP) DNA in the libraries. If the methylation-sensitive restriction enzymes (AciI, HhaI, BstUI, HpaII, and Hinp1I) failed to cut a site between the promoter primer pairs due to the presence of methylation, the target promoter site would amplify similarly to the uncut library control, and the ΔCt(Cut)-ΔCt(Uncut) would approach zero. Both the APC1-3 and GSTP1-1 gene promoter region primers demonstrated the presence of target promoter DNA, and thus protection from methylation-sensitive restriction enzyme cutting, with as little as 1% or less of cancer cell line DNA present, indicating a sensitivity detection limit of at least 99%.
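The readout described above reduces to a difference of threshold cycles between the digested and undigested library: a difference near zero suggests the site was protected by methylation, while a large delay suggests cleavage. The following Python sketch illustrates that comparison. The Ct values and the one-cycle cutoff are hypothetical and are not taken from FIG. 65; the function names are illustrative.

```python
def ct_difference(ct_cut, ct_uncut):
    """Cycles by which the digested library lags behind its uncut control."""
    return ct_cut - ct_uncut


def looks_protected(ct_cut, ct_uncut, max_delay_cycles=1.0):
    """True when the digested library amplifies almost like the uncut control.

    max_delay_cycles is a hypothetical cutoff, not a value from the text.
    """
    return ct_difference(ct_cut, ct_uncut) <= max_delay_cycles


# Made-up Ct values, for illustration only.
examples = {
    "site protected by methylation": (25.4, 25.0),
    "unmethylated site, cleaved": (36.0, 25.5),
}
for label, (cut, uncut) in examples.items():
    diff = ct_difference(cut, uncut)
    print(f"{label}: {diff:.1f} cycles, protected={looks_protected(cut, uncut)}")
```

In practice the cutoff would need to be set from replicate uncut controls rather than fixed in advance.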

Example 41 Amplifiability and Cleavage of Cell-free Urine DNA and Methylome Library Prepared from Cell-free Urine DNA with Methylation-sensitive Restriction Enzymes

This example describes the comparison of amplifiability of promoter sites in cell-free urine DNA with that of non-amplified methylome library prepared from cell-free urine DNA with or without cleavage with methylation-sensitive restriction enzymes.

Cell-free DNA was isolated from urine as described in Example 31. A sample of 50 ng DNA was diluted to 1 ng/μl in 1× NEBuffer 4 and split into 2 equal aliquots. One aliquot was digested with 10 units of AciI and HhaI, and 5 units each of BstUI, HpaII, and Hinp1I (NEB) in a final volume of 28 μl for 12 hours at 37° C., followed by 2 hours at 60° C. in 50 μl of NEBuffer 4. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Another sample of 50 ng DNA was processed for library preparation by incubation in 1× NEBuffer 4 (NEB) with 1.2 units of Klenow fragment of DNA polymerase I (USB Corporation), 2 μM of dU-Hairpin Adaptor (Table VI, SEQ ID NO: 172), 1 unit of uracil-DNA glycosylase (UDG), 800 units of T4 DNA ligase, 40 μM dNTPs, 1 mM ATP, and 0.1 mg/ml BSA for 1 hour at 37° C. in a final volume of The library was diluted to 50 μl in 1× NEBuffer 4 and split into 2 equal aliquots. One aliquot was digested with 10 units of AciI and HhaI, and 5 units each of BstUI, HpaII, and Hinp1I (NEB) for 12 hours at 37° C., followed by 2 hours at 60° C. in 28 μl of NEBuffer 4. The second aliquot was incubated in parallel but without restriction enzymes (“uncut” control).

Aliquots of 6.25 ng of both cut and uncut urine DNA and non-amplified methylome library DNA were analyzed for promoter sequences by quantitative PCR in a reaction mix comprising 1× Titanium Taq reaction buffer (Clontech), 200 μM of each dNTP, fluorescein calibration dye (1:100,000) and SYBR Green I (1:100,000), 200 nM each forward and reverse primer (Table VII, SEQ ID NO: 176 and SEQ ID NO: 177 for GSTP-1 promoter and SEQ ID NO: 113 and SEQ ID NO: 114 for APC-1 promoter), 4% DMSO, 0.5 M betaine (Sigma), and 5 units of Titanium Taq polymerase (Clontech) in a final volume of 75 μl. Samples were pre-heated at 72° C. for 15 min followed by 95° C. for 5 min and cycling at 94° C. for 15 sec and 68° C. for 2 min on an I-Cycler real-time PCR instrument (Bio-Rad).

FIG. 67 shows amplification curves for two promoter sites. As shown, processing of cell-free DNA through the enzymatic treatments of methylome library preparation resulted in: a) improved PCR amplifiability of promoter sites and b) improved cleavage with restriction enzymes. The amplification of promoter sites improved between 4 and 6 cycles (16 to 64-fold), whereas the cleavage with a mix of restriction enzymes increased from an average of 3 cycles difference between digested and non-digested crude urine DNA to between 7 and 10 cycles difference between digested and non-digested library DNA. Thus, the fraction of cell-free urine DNA that is in double-stranded conformation after enzymatic treatment is at least 10 times greater than the DNA prior to the treatment.
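One way to read the cleavage figures above is that, if each cycle is a perfect doubling, a delay of n cycles between digested and non-digested DNA leaves roughly 2^-n of the target copies uncleaved. The following Python sketch applies that assumption to the 3-cycle and 7-cycle delays reported above; the function name is illustrative, and ideal efficiency is an assumption rather than a measured value.

```python
def surviving_fraction(delay_cycles):
    """Approximate fraction of target copies left uncleaved.

    Assumes perfect doubling per PCR cycle (an idealization).
    """
    return 2.0 ** (-delay_cycles)


crude = surviving_fraction(3)    # crude urine DNA, about 3 cycles delay
library = surviving_fraction(7)  # library DNA, lower bound of 7 cycles delay

print(f"crude urine DNA: {crude:.1%} of copies uncleaved")
print(f"library DNA:     {library:.2%} of copies uncleaved")
print(f"reduction:       about {crude / library:.0f}-fold")
```

Under this idealization, the shift from 3 to 7 cycles corresponds to about a 16-fold reduction in uncleaved copies, which is in line with the "at least 10 times" estimate stated in the text.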

REFERENCES

All patents and publications mentioned in the specification are indicative of the levels of those skilled in the art to which the invention pertains. All patents and publications are herein incorporated by reference to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference.

PATENTS

-   PCT WO 99/28498
-   PCT WO 00/50587
-   PCT WO 03/035860
-   PCT WO 03/035860A1
-   PCT WO 03/027259A2
-   PCT WO 03/025215A1
-   PCT WO 03/080862A1
-   PCT WO 03/087774 A2
-   U.S. Pat. No. 6,214,556
-   U.S. Pat. No. 6,261,782
-   U.S. Pat. No. 6,300,071
-   U.S. Pat. No. 6,383,754
-   U.S. Pat. No. 6,605,432
-   U.S. Patent Application No. 20010046669
-   U.S. Patent Application No. 20030099997A1
-   U.S. Patent Application No. 20030232371A1
-   U.S. Patent Application No. 20030129602A1
-   U.S. Patent Application No. 20050009059A1

PUBLICATIONS

Advances in Immunology, Academic Press, New York.

Annual Review of Immunology, Academic Press, New York.

Akey, D. T., Akey, J. M., Zhang, K., Jin, L. 2002. Assaying DNA methylation based on high-throughput melting curve approaches. Genomics, 80: 376-384.

Akiyoshi, S., Kanada, H., Okazaki, Y., Akama, T., Nomura, K., Hayashizaki, Y., and Kitagawa, T. 2000. A genetic linkage map of the MSM Japanese wild mouse strain with restriction landmark genomic scanning (RLGS). Mamm. Genome, 11: 356-359.

Anderson, S. 1981. Shotgun DNA sequencing using cloned DNase I-generated fragments. Nucleic Acids Res., 9: 3015-3027.

Ausubel, F. M., Brent, R., Kingston, R. E., Moore, D. O., Seidman, J. S., Smith, J. A., and Struhl, K. 1987. Current protocols in molecular biology. Wiley, New York, N.Y.

Badal, V., Chuang, L. S. H., Tan, E. H.-H., Badal, S., Villa, L. L., Wheeler, C. M., Li, B. F. L., and Bernard, H.-U. 2003. CpG methylation of human papillomavirus type 16 DNA in cervical cancer cell lines and in clinical specimens: genomic hypomethylation correlates with carcinogenic progression. J. Virol., 77: 6227-6234.

Bankier, A. T. 1993. Generation of random fragments by sonication. Methods Mol. Biol., 23: 47-59.

Barbin, A., Montpellier, C., Kokalj-Vokac, N., Gibaud, A., Niveleau, A., Malfoy, B., Dutril-Laux, B., and Boureois, C. A. 1994. New sites of methylcytosine-rich DNA detected on metaphase chromosomes. Hum. Genet., 94: 684-692.

Baumer, A. 2002. Analysis of the methylation status of imprinted genes based on methylation-specific polymerase chain reaction combined with denaturing high-performance liquid chromatography. Methods, 27: 139-143.

Baylin, S. B., and Herman, J. G. 2000. DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet., 16: 168-174.

Bodenteich, A., Chissoe, S. L., Wang, Y.-F., and Roe, B. A. 1994. Shotgun cloning as the strategy of choice to generate template for high-throughput dideoxynucleotide sequencing. In: Automated DNA sequencing and analysis (ed. M. D. Adams, C. Fields, and J. C. Venter), pp. 42-50. Academic Press, San Diego, Calif.

Branum, M. E., Tipton, A. K., Zhu, S., and Que, L. Jr. 2001. Double-strand hydrolysis of plasmid DNA by dicerium complexes at 37 degrees C. J. Am. Chem. Soc., 123: 1898-1904.

Burn, N., and Chaubert, P. 1999. Complex methylation patterns analyzed by single-strand conformation polymorphism. Biotechniques, 26: 232-234.

Cedar, H., Soage, A., Glaser, G., and Razin, A. 1979. Direct detection of methylated cytosine in DNA by use of the restriction enzyme MspI. Nucleic Acids Res., 6: 2125-2132.

Champoux, J. J. 2001. DNA topoisomerases: structure, function, and mechanism. Annu. Rev. Biochem., 70: 369-413.

Chen, C.-M., Chen, H.-L., Hsiau, T. H.-C., Hsiau, A. H.-A., Shi, H., Brock, G. J. R., Wei, S. H., Caldwell, C. W., Yan, P. S., and Huang, T. H.-M. 2003. Methylation Target Array for Rapid Analysis of CpG Island Hypermethylation in Multiple Tissue Genomes. Am. J. Pathol., 163: 37-45.

Chotai, K. A. and Payne, S. J. 1998. A rapid, PCR based test for differential molecular diagnosis of Prader-Willi and Angelman syndromes. J. Med. Genet., 35: 472-5.

Coligan, J. E., Kruisbeek A. M., Margulies, D. H., Shevach, E. M., Strober, W. 1991. Current protocols in immunology. John Wiley and Sons, Hoboken, N.J.

Collela, S., Shen, L., Baggerly, K. A., Issa, J.-P. J., and Krahe, R. 2003. Sensitive and quantitative universal Pyrosequencing™ methylation analysis of CpG sites. Biotechniques, 34: 146-150.

Cottrell, S. E., Distler, J., Goodman, N. S., Mooney, S. H., Kluth, A., Olek, A., Schwope, I., Tetzner, R., Ziebarth, H., and Berlin, K. 2004. A real-time PCR assay for DNA-methylation using methylation-specific blockers. Nucleic Acids Res., 32: e10.

Dryden, D. T. F., Murray, N. E., and Rao, D. N. 2001. Nucleoside triphosphate-dependent restriction enzymes. Nucleic Acids Res., 29: 3728-3741.

Dunn, B. K. 2003. Hypomethylation: one side of a larger picture. Ann. N.Y. Acad. Sci., 983: 28-42.

Duthie, S. J., Narayanan, S., Blum, S., Pine, L., and Brand, G. M. 2000. Folate deficiency in vitro induces uracil misincorporation and DNA hypomethylation and inhibits DNA excision repair in immortalized normal human colon epithelial cells. Nutr. Cancer, 37: 245-251.

Eads, C. A., Danenberg, K. D., Kawakami, K., Saltz, L. B., Blake, C., Shibata, D., Danenberg, P. V., and Laird, P. W. 2000. MethyLight: a high-throughput assay to measure DNA methylation. Nucleic Acids Res., 28: E32.

Fanning, T. G., Hu, W. S., and Cardiff, R. D. 1985. Analysis of tissue-specific methylation patterns of mouse mammary tumor virus DNA by two-dimensional Southern blotting. J. Virol., 54: 726-730.

Fraga, M. F. and Esteller, M. 2002. DNA Methylation: A profile of methods and applications. Biotechniques, 33: 632-649.

Fraga, M. F., Rodriquez, R., and Canal, M. J. 2000. Rapid quantification of DNA methylation by high performance capillary electrophoresis. Electrophoresis, 21: 2990-2994.

Franklin, S. J. 2001. Lanthanide-mediated DNA hydrolysis. Curr. Opin. Chem. Biol., 5: 201-208.

Freshney, R. I. 1987. Culture of animal cells: a manual of basic technique, 2d ed., Wiley-Liss, London.

Friso, S., Choi, S. W., Dolnikowski, G. G., and Selhub, J. 2002. A method to assess genomic DNA methylation using high-performance liquid chromatography/electrospray ionization mass spectrometry. Anal. Chem., 74: 4526-4531.

Frommer, M., McDonald, L. E., Millar, D. S., Collis, C. M., Watt, F., Grigg, G. W., Molloy, P. L., and Paul, C. L. 1992. A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc. Natl. Acad. Sci. USA, 89: 1827-1831.

Fruhwald, M. C. and Plass, C. 2002. Global and gene-specific methylation patterns in cancer: aspects of tumor biology and clinical potential. Mol. Genet. Metabol., 75: 1-16.

Furiuchi, Y., Wataya, Y., Hayatsu, H., and Ukita, T. 1970. Chemical modification of tRNA-Tyr-yeast with bisulfite. A new method to modify isopentenyladenosine residue. Biochem. Biophys. Res. Commun., 41: 1185-1191.

Gait, M. 1984. Oligonucleotide Synthesis. Practical Approach Series. IRL Press, Oxford, U.K.

Gingrich, J. C., Boehrer, D. M., Basu, S. B. 1996. Partial CviJI digestion as an alternative approach to generate cosmid sublibraries for large-scale sequencing projects. Biotechniques, 21: 99-104.

Gonzalgo, M. L., and Jones, P. A. 1997. Rapid quantitation of methylation differences at specific sites using methylation-sensitive single nucleotide primer extension (Ms-SNuPE). Nucleic Acids Res., 25: 2529-2531.

Gonzalgo, M. L., Liang, G., Spruck, C. H. 3rd, Zingg, J. M., Rideout, W. M. 3rd, and Jones, P. A. 1997. Identification and characterization of differentially methylated regions of genomic DNA by methylation-sensitive arbitrarily primed PCR.

Guldberg, P., Worm, J., and Gronbaek, K. 2002. Profiling DNA methylation by melting analysis. Methods, 27: 121-127.

Grunau, C., Clark, S. J., and Rosenthal, A. 2001. Bisulfite genomic sequencing: systematic investigation of critical experimental parameters. Nucleic Acids Res., 29: E65.

Hayashizaki, Y., Hirotsune, S., Okazaki, Y., Hatada, I., Shibata, H., Kawai, J., Hirose, K., Watanabe, S., Fushiki, S., Wada, S., et al. 1993. Restriction landmark genomic scanning method and its various applications. Electrophoresis, 14: 251-258.

Hayes, J. J., Kam, L., and Tullius, T. D. 1990. Footprinting protein-DNA complexes with gamma-rays. Methods Enzymol., 186: 545-549.

Herman, J. G., Graff, J. R., Myohanen, S., Nelkin, B. D., and Baylin, S. B. 1996. Methylation-specific PCR: a novel PCR assay for methylation status of CpG islands. Proc. Natl. Acad. Sci. USA, 93: 9821-9826.

Jain, P. K. 2003. Epigenetics: the role of methylation in the mechanism of action of tumor suppressor genes. Ann. N.Y. Acad. Sci., 983: 71-83.

Jones, P. A., and Baylin, S. B. 2002. The fundamental role of epigenetic events in cancer. Nat. Rev. Genet., 3: 415-428.

Kaneda, A., Takai, D., Kaminishi, M., Okochi, E., and Ushijima, T. 2003. Methylation-sensitive representational difference analysis and its application to cancer research. Ann. N.Y. Acad. Sci., 983: 131-141.

Komiyama, M., and Sumaoka, J. 1998. Progress towards synthetic enzymes for phosphoester hydrolysis. Curr. Opin. Chem. Biol., 2: 751-757.

Lippman, Z., Gendrel, A.-V., Colot, V., and Martiensen, R. 2005. Profiling DNA methylation patterns using genomic tiling microarrays. Nature Methods, 2: 219-224.

Matin, M. M., Baumer, A., and Hornby, D. P. 2002. An analytical method for the detection of methylation differences at specific chromosomal loci using primer extension and ion pair reverse phase HPLC. Hum. Mutat., 20: 305-311.

Matsuyama, T., Kimura, M. T., Koike, K., Abe, T., Nakano, T., Asami, T., Ebisuzaki, T., Held, W. A., Yoshida, S., and Nagase, H. 2003. Global methylation screening in the Arabidopsis thaliana and Mus musculus genome: applications of virtual image restriction landmark genomic scanning (Vi-RLGS). Nucleic Acids Res., 31: 4490-4496.

Methods in Enzymology. Academic Press, New York.

Miller, J. M., and Calos, M. P. 1987. Gene Transfer Vectors for Mammalian Cells. Cold Spring Harbor Laboratory, Cold Spring Harbor.

Miller, O. J., Schnedl, W., Allen, J., and Erlander, B. F. 1974. 5-methylcytosine localized in mammalian constitutive heterochromatin. Nature, 251: 636-637.

Nouzova M, Holtan N, Oshiro M M, Isett R B, Munoz-Rodriguez J L, List A F, Narro M L, Miller S J, Merchant N C, Futscher B W 2004. Epigenomic changes during leukemia cell differentiation: analysis of histone acetylation and cytosine methylation using CpG island microarrays. J. Pharmacol. Exp. Ther., 311: 968-981.

Oakeley, E. J., Schmitt, F., and Jost, J. P. 1999. Quantification of 5-methylcytosine in DNA by the chloroacetaldehyde reaction. BioTechniques, 27: 744-752.

Oefner, P. J., Hunicke-Smith, S. P., Chiang, L., Dietrich, F., Mulligan, J., and Davis, R. W. 1996. Efficient random subcloning of DNA sheared in a recirculating point-sink flow system. Nucleic Acids Res., 24: 3879-3886.

Panne, D., Raleigh, E. A., and Bickle, T. A. 1999. The McrBC endonuclease translocates DNA in a reaction dependent on GTP hydrolysis. J. Mol. Biol., 290: 49-60.

Peraza-Echeverria, S., Herrera-Valencia, V. A., and James-Kay, A. 2001. Detection of DNA methylation changes in micropropagated banana plants using methylation-sensitive amplification polymorphism (MSAP). Plant Sci., 161: 359-367.

Price, M. A., and Tullius, T. D. 1992. Using hydroxyl radical to probe DNA structure. Methods Enzymol., 212: 194-219.

Pogibny, I., Ping, Y., and James, S. J. 1999. A sensitive new method for rapid detection of abnormal methylation patterns in global DNA and within CpG islands. Biochem. Biophys. Res. Commun., 262: 624-628.

Ramsahoye, B. H. 2002. Measurement of genome wide DNA methylation by reversed-phase high-performance liquid chromatography. Methods, 27: 156-161.

Rand K., Qu, W., Ho, T., Clark, S. J., Molloy, P. 2002. Conversion-specific detection of DNA methylation using real-time polymerase chain reaction (ConLight-MSP), to avoid false positives. Methods, 27: 114-120.

Richards, O. C., and Boyer, P. D., 1965. Chemical mechanism of sonic,acid, alkaline and enzymatic degradation of DNA. J. Mol. Biol. 11:327-340.

Roots, R., Holley, W., Chatterjee, A., Rachal, E., and Kraft, G. 1989.The influence of radiation quality on the formation of DNA breaks. Adv.Space Res., 9: 45-55.

Rouillard, J. M., Erson, A. E., Kuick, R., Asakawa, J., Wimmer, K., Muleris, M., Petty, E. M., and Hanash, S. 2001. Virtual genome scan: a tool for restriction landmark-based scanning of the human genome. Genome Res. 11: 1453-1459.

Sadri, R., and Hornsby, P. J. 1996. Rapid analysis of DNA methylation using new restriction enzyme sites created by bisulfite modification. Nucleic Acids Res., 24: 5058-5059.

Sambrook, J., Fritsch, E. F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, second edition, Cold Spring Harbor Laboratory, Cold Spring Harbor.

Sasaki, M., Anast, J., Bassett, W., Kawakami, T., Sakuragi, N., and Dahiya, R. 2003. Bisulfite conversion-specific and methylation-specific PCR: a sensitive technique for accurate evaluation of CpG methylation. Biochem. Biophys. Res. Commun., 309: 305-309.

Steigerwald, S. D., Pfeifer, G. P., and Riggs, A. D. 1990. Ligation-mediated PCR improves the sensitivity of methylation analysis by restriction enzymes and detection of specific DNA strand breaks. Nucleic Acids Res., 18: 1435-1439.

Stewart, F. J., and Raleigh, E. A. 1998. Dependence of McrBC cleavage on distance between recognition elements. Biol. Chem., 379: 611-616.

Studier, F. W. 1979. Relationships among different strains of T7 and among T7-related bacteriophages. Virology 95: 70-84.

Sutherland, E., Coe, L., and Raleigh, E. A. 1992. McrBC: a multisubunit GTP-dependent restriction endonuclease. J. Mol. Biol., 225: 327-348.

Suzuki, H., Itoh, F., Toyota, M., Kikuchi, T., Kakiuchi, H., Hinoda, Y., and Imai, K. 2000. Quantitative DNA methylation analysis by fluorescent polymerase chain reaction single strand conformation polymorphism using an automated DNA sequencer. Electrophoresis, 21: 904-908.

Tawa, R., Tamura, G., Sakurai, H., Ono, T., and Kurishita, A. 1994. High-performance liquid chromatographic analysis of methylation changes of CCGG sequence in brain and liver DNA of mice during pre- and postnatal development. J. Chromatogr. B Biomed. Appl., 653: 211-216.

Thorstenson, Y. R., Hunicke-Smith, S. P., Oefner, P. J., and Davis, R. W. 1998. An automated hydrodynamic process for controlled, unbiased DNA shearing. Genome Res., 8: 848-855.

Tost, J., Dunker, J., and Gut, I. G. 2003. Analysis and quantification of multiple methylation variable positions in CpG islands by Pyrosequencing. Biotechniques, 35: 152-156.

Tost, J., Schatz, P., Schuster, M., Berlin, K., and Gut, I. G. 2003. Analysis and accurate quantification of CpG methylation by MALDI mass spectrometry. Nucleic Acids Res., 31: e50.

Toyota, M., and Issa, J. P. 2002. Methylated CpG island amplification for methylation analysis and cloning differentially methylated sequences. Methods Mol. Biol., 200: 101-110.

Toyota, M., Ho, C., Ahuja, N., Jair, K.-W., Li, Q., Ohe-Toyota, M., Baylin, S. B., and Issa, J.-P. J. 1999. Identification of differentially methylated sequences in colorectal cancer by methylated CpG island amplification. Cancer Res., 59: 2307-2312.

Tullius, T. D. 1991. DNA footprinting with the hydroxyl radical. Free Radic. Res. Commun., 12-13: 521-529.

Ushijima, T., Morimura, K., Hosoya, Y., Okonogi, H., Tatematsu, M., Sugimura, T., and Nagao, M. 1997. Establishment of methylation-sensitive-representational difference analysis and isolation of hypo- and hypermethylated genomic fragments in mouse liver tumors. Proc. Natl. Acad. Sci. USA, 94: 2284-2289.

Weir, D. M. 1978. Handbook of Experimental Immunology. Blackwell Scientific Publications, Oxford, U.K.

Wold, M. S. (1997) Replication protein A: A heterotrimeric, single-stranded DNA-binding protein required for eukaryotic DNA metabolism. Ann. Rev. Biochem. 66: 61-92.

Xiong, Z., and Laird, P. W. 1997. COBRA: a sensitive and quantitative DNA methylation assay. Nucleic Acids Res., 25: 2532-2534.

Yan, P. S., Chen, C-M, Shi, H., Rahmatpanah, F., Wei, S. H., Caldwell, C. W., and Huang, T. H-M. 2001. Dissecting Complex Epigenetic Alterations in Breast Cancer Using CpG Island Microarrays. Cancer Research 61: 8375-8380.

Yuan, Y., SanMiguel, P. J., and Bennetzen, J. L. 2002. Methylation-spanning linker libraries link gene-rich regions and identify epigenetic boundaries in Zea mays. Genome Res., 12: 1345-1349.

Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure of the present invention, processes, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present invention. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.

1-73. (canceled)
 74. A method for preparing a DNA molecule, the method comprising: a) providing a sample comprising methylated nucleic acids and non-methylated nucleic acids; b) generating a nucleic acid library from the methylated nucleic acids and the non-methylated nucleic acids by attaching hairpin adaptors; c) digesting the nucleic acid library with a methylation-sensitive restriction enzyme; and d) detecting methylation sites within nucleic acids of the digested nucleic acid library; wherein the detecting comprises detecting methylated nucleic acids that comprise less than 10% of the nucleic acids in the sample.
 75. The method of claim 74, wherein the detecting comprises detecting methylated nucleic acids that comprise less than 1% of the nucleic acids in the sample.
 76. The method of claim 74, wherein the detecting comprises detecting methylated nucleic acids that comprise less than 0.1% of the nucleic acids in the sample.
 77. The method of claim 74, wherein the detecting comprises detecting methylated nucleic acids that comprise from 10% to 0.01% of the nucleic acids in the sample.
 78. The method of claim 74, wherein the nucleic acids are from serum DNA.
 79. The method of claim 74, wherein the nucleic acids are from circulating cell-free DNA.
 80. The method of claim 74, wherein the nucleic acids are from an apoptosed cell.
 81. The method of claim 74, wherein the nucleic acids are from a source selected from the group consisting of: biopsy materials, pap smears, serum, and plasma, and any combinations thereof.
 82. The method of claim 74, wherein the attaching hairpin adaptors comprises ligating.
 83. The method of claim 82, wherein the ligating comprises ligating a first adaptor comprising a known sequence and a nonblocked 3′ end to an end of the digested fragments to produce an adaptor-linked molecule, wherein the 5′ end of the digested fragments are attached to the nonblocked 3′ end of the first adaptor, leaving a nick site between a juxtaposed 3′ end of the digested fragments and a 5′ end of the first adaptor.
 84. The method of claim 83, further comprising extending the juxtaposed 3′ end of the digested fragments from the nick site by polymerization.
 85. The method of claim 74, wherein the detecting comprises using a detection method selected from the group consisting of: sequencing, quantitative real-time PCR, ligation chain reaction, ligation-mediated PCR, probe hybridization, probe amplification, and microarray hybridization, and any combinations thereof.
 86. The method of claim 74, wherein the generating comprises generating a library from 1 to 100 ng of said digested fragments.
 87. The method of claim 74, wherein the method is performed in a single tube.
 88. The method of claim 74, wherein the methylation sensitive restriction enzyme comprises one or more methylation sensitive restriction endonucleases selected from the group consisting of: McrBC, Aci I, Bst UI, Hha I, HinP1, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi EI, and Hga I, and any combinations thereof.
 89. A method for preparing a DNA molecule, the method comprising: a) providing a sample comprising methylated nucleic acids and non-methylated nucleic acids; b) digesting the methylated and non-methylated nucleic acids with a methylation-sensitive restriction enzyme, thereby creating digested fragments; c) generating a library from the digested fragments; and d) detecting methylation sites within the library; wherein said detecting comprises detecting methylated nucleic acids that comprise less than 10% of the nucleic acids in said sample.
 90. The method of claim 89, wherein the detecting comprises detecting methylated nucleic acids that comprise less than 1% of the nucleic acids in the sample.
 91. The method of claim 89, wherein the detecting comprises detecting methylated nucleic acids that comprise less than 0.1% of the nucleic acids in the sample.
 92. The method of claim 89, wherein the detecting comprises detecting methylated nucleic acids that comprise from 10% to 0.01% of the nucleic acids in the sample.
 93. The method of claim 89, wherein said nucleic acids are from serum DNA.
 94. The method of claim 89, wherein the nucleic acids are from circulating cell-free DNA.
 95. The method of claim 89, wherein the nucleic acids are from an apoptosed cell.
 96. The method of claim 89, wherein the nucleic acids are from a source selected from the group consisting of: biopsy materials, pap smears, serum, and plasma, and any combinations thereof.
 97. The method of claim 89, wherein the generating of the library comprises incorporating a nucleic acid molecule into at least some of the digested fragments to provide first modified DNA molecules, by incorporating at least one primer from a plurality of primers, said primers comprising a 5′ constant sequence and a 3′ variable sequence that is substantially non-self-complementary and substantially non-complementary to other primers in the plurality, wherein the sequence of the constant and variable regions consists essentially of only two types of non-complementary nucleotides selected from the group consisting of adenines and guanines; adenines and cytosines; guanines and thymidines; and cytosines and thymidines, such that the primers of the population will not cross-hybridize or self-hybridize under amplification conditions.
 98. The method of claim 97, further comprising amplifying one or more of the first modified DNA molecules to provide amplified first modified DNA molecules.
 99. The method of claim 89, wherein the generating of the library comprises ligating hairpin adaptors to the digested fragments.
 100. The method of claim 89, wherein the detecting comprises using a detection method selected from the group consisting of: sequencing, quantitative real-time PCR, ligation chain reaction, ligation-mediated PCR, probe hybridization, probe amplification, and microarray hybridization, and any combinations thereof.
 101. The method of claim 89, wherein the generating comprises generating a library from 1 to 100 ng of the digested fragments.
 102. The method of claim 89, wherein the method is performed in a single tube.
 103. The method of claim 89, wherein the methylation sensitive restriction enzyme comprises one or more methylation sensitive restriction endonucleases selected from the group consisting of: McrBC, Aci I, Bst UI, Hha I, HinP1, Hpa II, Hpy 99I, Ava I, Bce AI, Bsa HI, Bsi EI, and Hga I, and any combinations thereof.